ABSTRACT


The goal of this research is to develop a robust control strategy for constructing image
understanding systems (IUS). This paper proposes a general framework based on the integration of “re¬
cluster. The goal of the IUS is to verify the hypotheses. We constructed an image understanding
system,  SIGMA,  based  on  this  framework  and  demonstrated  its  performance  on  an  aerial  image  of 
1. Introduction


A primary objective in computer vision research is to construct image
understanding systems (IUS’s) which can analyze images based on object
models. Usually, an IUS analyzes images by constructing interpretations in
terms of the object models given to the IUS. Interpretation refers to the
mapping between objects (e.g., houses, roads) in the object model and image
structures (e.g., regions, lines, points) in the image. During the analysis, an
IUS needs to perform the following two types of tasks:

- segmentation: the task of grouping pixels together to construct
image structures that can be associated with objects in the given
model.

- interpretation: the task of constructing mappings between image
structures and objects.

Segmentation  is  practical  when  sufficient  knowledge  is  available  about  the 
image  to  be  processed  and  the  image  structures  to  be  computed.  The  base  of 
knowledge  increases  as  the  interpretation  process  develops,  leading  to  more 
constrained  and  therefore  more  reliable  segmentation. 

Many IUS’s were constructed in the late 1970’s ([Barr81], [Ball82],
[Binf82], [Ball82]). Most systems integrate segmentation and interpretation
using one of the following types of analysis.

1) Bottom-up analysis: the image structures are extracted from the
image, and are interpreted as instances of the objects in the model.
For example, when a large rectangular region is extracted, it is
interpreted as a house.

2) Top-down analysis: the appearance of the object is first
determined, and the associated image structures are extracted. For
example, suppose an IUS wants to find a house; the IUS invokes the
house model and establishes the descriptions of the specific image
structures to be extracted from the image.

It is generally accepted that image understanding systems should incorporate
both bottom-up and top-down analyses. Some systems use only one type of
analysis. MSYS [Barr76], developed by Barrow and Tenenbaum, used bottom-up
analysis. Image structures are first segmented from the image. A set of
initial labels is assigned to these image structures (based on height,
homogeneity, etc.). Then, geometric constraints between labels are used to
filter out inconsistent labelings. Bolles [Boll76], on the other hand, used
top-down analysis. In his system, a goal is first constructed. The system
then matches the goal, which is represented as a template, with the image. A
similar approach is used in Garvey’s [Garv76] system. Other systems (Hanson
and Riseman [Hans78]; Matsuyama [Naga80]) incorporate both types of analysis
but use ad hoc rules to determine which type of analysis is to be used at
what stage during the analysis. Such systems often require a large set of
domain-dependent control knowledge to direct the analysis of the IUS.


It is the goal of this research to develop a robust control strategy for
constructing image understanding systems, thus eliminating the need to use large
amounts of domain-specific control knowledge in specific applications. In this
paper, we propose a general framework which enables IUS’s to integrate both
bottom-up and top-down analyses into a single flexible reasoning process. We
construct an image understanding system, SIGMA, based on this framework
and provide demonstrations of its performance on images of a suburban
housing development.

1.1.  Integration  of  hypotheses 

Consider the following proposition:

If a structure of type x is present in the scene having certain
spatial properties, then there should exist a structure of type y having
certain properties in the image.

It  is  often  the  case  that  what  is  known  about  x  is  not  sufficient  to  completely 
characterize  y  (i.e.,  we  might  be  able  to  predict  its  size  and  color,  but  perhaps 
not  its  orientation).  In  addition,  there  might  be  many  x’s,  each  predicting  the 
occurrence  of  y,  but  each  contributing  different  constraints  on  the  properties 
of  y. 

For example, by locating a house in the image, one may predict the
occurrences of other objects, e.g., neighboring houses. Furthermore, the
discovery of a rectangular homogeneous region in the image may also generate
a prediction of a house. It is usually the case (depending on the object model)
that each of these predictions provides some “cues” about the occurrence of a
house, and it is the integration of all these cues that may characterize the
occurrence of a house adequately enough to easily recognize it.

Let us call the predictions about the occurrences of objects in the image
hypotheses. Suppose several hypotheses, which may be independently
generated, are predictions about objects at the same location in the image. It is
reasonable to assume that these hypotheses are predictions about the “same”
object, although each may only constrain some subset of the properties of the
object. By integrating these hypotheses, an IUS could construct a more
complete description of the object and use it to direct a more effective and
informed analysis.
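
The idea of integrating partial hypotheses can be illustrated with a small Python sketch. It is not SIGMA's implementation: the `Hypothesis` class, the `integrate` function, and the representation of each constraint as a numeric (low, high) interval are assumptions made only for illustration. Each hypothesis constrains a subset of the attributes, and integration intersects the intervals attribute by attribute.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    """A prediction about an object; constraints map attribute -> (low, high)."""
    object_type: str
    constraints: dict = field(default_factory=dict)


def integrate(hypotheses):
    """Intersect the interval constraints of all hypotheses.

    Returns a composite Hypothesis, or None if two constraints conflict.
    """
    merged = {}
    for h in hypotheses:
        for attr, (low, high) in h.constraints.items():
            cur_low, cur_high = merged.get(attr, (float("-inf"), float("inf")))
            low, high = max(low, cur_low), min(high, cur_high)
            if low > high:  # empty intersection: the cues disagree
                return None
            merged[attr] = (low, high)
    return Hypothesis(hypotheses[0].object_type, merged)


# Two independently generated cues about the "same" house (made-up values).
from_neighbor = Hypothesis("house", {"area": (1000, 2500), "orientation": (80, 100)})
from_region = Hypothesis("house", {"area": (1200, 3000)})

composite = integrate([from_neighbor, from_region])
print(composite.constraints)
```

The composite description is tighter than either cue alone: the area interval shrinks while the orientation constraint, known from only one cue, is carried over unchanged.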

1.2.  An  overview  of  the  SIGMA  image  understanding  system 

Figure 1-2 shows the system architecture of the SIGMA image
understanding system. The user provides object models to SIGMA, and the results
of the analysis are available to the user through a query-answering module.

The image is first segmented by a general purpose low level vision system
(LLVS). The segmentation results are recorded in the iconic/symbolic
database. The high level vision system (HLVS) uses the object model either to
interpret image structures already extracted or to direct the low level
processes to search for image structures not yet discovered. During the
analysis, the HLVS incrementally constructs an interpretation network for the
input image. A “goal” is given to the query-answering module (QAM). At
the end of each analysis iteration, the QAM is activated and “matches” the
current status of the analysis with the goal. This construction process
continues until the “goal” is accomplished (i.e., a successful match between the
current status of the analysis and the goal) or no more interpretations can be
constructed. At this stage, the QAM provides the current status of the
analysis. In the following subsections, we present each module of SIGMA in
more detail.

1.2.1.  The  low  level  vision  system 

In SIGMA, the LLVS is formulated as a domain-independent
goal-directed segmentation system. A goal, which is described by a list of
constraints on the image structures to be computed, is given to the LLVS. The
LLVS uses general segmentation techniques to extract such image structures.
Other systems have been constructed to perform goal-directed segmentation,
e.g., Selfridge [Self82] and Nazif & Levine [Nazi84].

Our approach differs from the approaches taken in these systems. We
assume that many specialized methods are needed to extract image features
from the image. An LLVS needs to select, from a pool, methods that best suit
the task. Furthermore, new methods are frequently developed that can
augment or replace the methods currently used by the LLVS. It is important to
design an LLVS so that adding methods to it is easy.

Our LLVS is based on a select-and-schedule strategy. When the LLVS is
asked to verify some hypothesis, it first selects those methods which are
applicable by matching the hypothesis against a decision table. Then, the LLVS
schedules the selected methods according to their potential. If one method
fails to verify the hypothesis, the next method will be tried until the
hypothesis is verified or until all methods have been tried and have failed.
This approach is similar to the “blackboard” method [Davi77] and the
“contract net” idea [Smit78], but the implementation here is simpler. For a
detailed discussion of the LLVS, see [Hwan84].
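
A select-and-schedule loop of this kind might look as follows in Python. This is only a sketch under stated assumptions: the layout of the decision table (a list of entries with `applies`, `potential`, and `extract` fields), the method names, and the numeric potentials are all hypothetical, and the methods are stubs rather than real segmentation procedures.

```python
def select_and_schedule(hypothesis, decision_table):
    """Select applicable methods, then try them in order of potential."""
    # Selection: keep only methods whose table entry matches the hypothesis.
    selected = [m for m in decision_table if m["applies"](hypothesis)]
    # Scheduling: most promising method first.
    selected.sort(key=lambda m: m["potential"], reverse=True)
    for m in selected:
        structure = m["extract"](hypothesis)
        if structure is not None:      # hypothesis verified
            return m["name"], structure
    return None                        # every selected method failed


# Hypothetical decision table; the methods are stubs, not real segmenters.
decision_table = [
    {"name": "edge_linking", "potential": 0.8,
     "applies": lambda h: h["structure"] == "line",
     "extract": lambda h: "L0"},
    {"name": "region_growing", "potential": 0.9,
     "applies": lambda h: h["structure"] == "region",
     "extract": lambda h: None},       # stub that fails
    {"name": "thresholding", "potential": 0.5,
     "applies": lambda h: h["structure"] == "region",
     "extract": lambda h: "R0"},       # stub that succeeds
]

result = select_and_schedule({"structure": "region"}, decision_table)
print(result)
```

Because methods are plain table entries, adding a new method amounts to appending one entry, which matches the stated design goal of making the LLVS easy to extend.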

1.2.2.  The  high  level  vision  system 

The high level vision system (HLVS) uses object models to interpret data
recorded in the iconic/symbolic database and construct an interpretation
network. The HLVS uses the integration of hypotheses principle to direct
analysis. This is implemented by the following reasoning steps.

1) Hypothesis generation: the HLVS generates hypotheses about
occurrences of objects in the image.

2) Hypothesis integration: the HLVS clusters “related” hypotheses
together.

3) Hypothesis abstraction: the HLVS computes a “composite hypothesis”
for each cluster.

4) Hypothesis verification: the HLVS selects hypotheses and verifies them
by computing values for those attributes which are not completely
constrained.


The HLVS performs the reasoning iteratively. At the end of each
iteration, the HLVS checks whether the “goal” is accomplished by activating the
QAM. If the goal is accomplished or no more interpretations can be
constructed, the construction process terminates and the status of the analysis is
available through the QAM.
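
The iterative control flow described above can be sketched as a small driver loop. The sketch is an assumption-laden outline, not SIGMA's code: the function `run_hlvs`, its callback parameters, the iteration cap, and the toy callbacks below are all hypothetical stand-ins for the four reasoning steps and the QAM check.

```python
def run_hlvs(generate, integrate, abstract, verify, goal_met, max_iterations=100):
    """Repeat generate -> integrate -> abstract -> verify until the goal is
    met or no new interpretation can be constructed."""
    database = []
    for _ in range(max_iterations):
        hypotheses = generate(database)                  # 1) generation
        clusters = integrate(hypotheses)                 # 2) integration
        composites = [abstract(c) for c in clusters]     # 3) abstraction
        verified = [v for v in map(verify, composites) if v is not None]  # 4)
        database.extend(verified)
        if goal_met(database):                           # QAM check
            return database, "goal accomplished"
        if not verified:
            return database, "no more interpretations"
    return database, "iteration limit reached"


# Toy callbacks: one house hypothesis per iteration, only three verifiable.
def generate(db):
    return [{"type": "house", "id": len(db)}]

def integrate(hypotheses):
    return [hypotheses] if hypotheses else []

def abstract(cluster):
    return cluster[0]

def verify(hypothesis):
    return hypothesis if hypothesis["id"] < 3 else None

db, status = run_hlvs(generate, integrate, abstract, verify,
                      goal_met=lambda d: len(d) >= 2)
print(len(db), status)
```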

1.2.3.  Query-answering  module 

Potentially, SIGMA constructs all possible interpretations for an image.
However, SIGMA needs to select, among many interpretations, a good one as
its conclusion. Instead of finding a “best interpretation”, we model this
selection process as a database query answering process. A program (QAM) was
developed to answer simple queries about the interpretation network and to
display the associated image structures.

The goal of the analysis is provided to the QAM as a query. Whenever
the QAM is activated (by the HLVS), it matches the goal with the
interpretations already constructed. If any interpretation that satisfies the goal is
found, the QAM enters into an answer mode and provides a query-answering
capability for selecting “good interpretations” and displaying the explanations
for these interpretations.

1.3. Outline of the paper


We first present the knowledge representation paradigm used in SIGMA.
In Section 3, we discuss a framework for performing hypothesis integration
and abstraction. This is followed by a detailed description of the system
constructed based on this framework. Conclusions are presented in the final
section.

2. Representation of object models

2.1.  What  to  represent? 

The knowledge representation formalism determines a general framework
for organizing the necessary knowledge into a knowledge base and supports a
powerful inference mechanism for guiding the recognition of a specific scene.
An appropriate knowledge representation tool can often simplify the task of
transferring problem domain expert knowledge into knowledge bases in
computer systems.

Consider  the  following  house  model: 


A house is “rectangular” or “L-shaped”; its area is larger than
1000 square feet but no larger than 2500 square feet. A house
usually belongs to a group of houses which are on the same side of a
road. Roads can be found near the house. Usually, the road is
parallel or perpendicular to the house and a driveway connects the
road to the house.


Based  on  how  an  IUS  uses  such  a  model  to  locate  houses  in  a  given  image,  one 
can  categorize  this  scene  knowledge  into  the  following  classes. 

1) What to look for. This class of knowledge describes the appearances of
objects (e.g., the type of image structures associated with objects). In the
house example, the appearance of the house is a homogeneous compact
rectangular region. To locate houses, an IUS segments the input image and
identifies as houses those regions which are rectangular and compact and
whose sizes are between 1000 and 2500 square feet.

2) Where to look. This class of knowledge includes the geometric and
topological relations between objects. The knowledge base might, for example,
specify (based on connectivity, relative orientation, etc.) relations between
driveways, houses, and roads. An IUS might, if one of these objects is
discovered (say a driveway), use this relation to initiate and constrain the
search for other objects (e.g., a connected house and road) not yet discovered.
An IUS might also use such relations to examine whether a house, a driveway,
or a road already discovered satisfy the required relations.

3) When to look. This class of knowledge describes strategies regarding the
application and confirmation of relations. On the one hand, we often want to
postpone applying a specific piece of relational knowledge until sufficient
information has been obtained to strongly suggest that the relation may be
applicable. On the other hand, since the confirmation process often involves
searching for image structures associated with other objects, we might also
want to postpone the confirmation of a specific relation until a sufficient
description of the object to be searched for is collected. For example, when the
IUS generates a house hypothesis, instead of searching for an image structure
associated with it immediately, the IUS might postpone the search until a
sufficient description of the house (e.g., shape, intensity, etc.) is available.

A  principal  objective  of  this  research  is  to  develop  a  representation 
scheme  which  simplifies  the  task  of  capturing  domain  knowledge  as  a 
knowledge  base  for  IUS’s.  This  section  presents  the  knowledge  representation 
scheme  used  in  the  SIGMA  system.  Note  that  the  scene  model  is  used  mainly 
by  the  HLVS  (High  Level  Vision  System)  module  in  SIGMA. 

2.2.  Basic  representation  primitives 

Our representation formalism is based on frame system theory [Mins75],
semantic networks [Wino75] [Hend79], and an object-oriented problem solving
style [Stee79] [Wein80] [Gold83]. In SIGMA, object models are represented as
a graph structure of nodes and arcs. Objects are described by “frames” (nodes
in the graph structure) while relations between these objects are described by
“rules” and “links” (arcs in the graph structure). In such a formalism, domain
knowledge is built around a set of objects and a set of operations that can be
applied to them.

The  basic  entities  of  the  representation  are  called  frames  and  are  used  to 
model  abstract  objects  in  the  problem  domain  such  as  “house”  or  “road”. 
Each  frame  may  have  many  associated  descriptions  that  are  defined  by  slots. 
Slots  are  similar  to  “property  lists”  in  LISP.  Each  slot  is  a  list  which  contains 
an  indicator  (i.e.,  name)  and  a  value. 

In  addition  to  slots  where  values  are  recorded,  we  can  also  associate  with 
frames  all  the  knowledge  which  is  used  to  compute  values  of  slots.  We 
represent  this  type  of  knowledge  as  rules. 

Rules used in this context are procedural, i.e., the knowledge about how
to compute values of slots is encoded in programs. As mentioned above, these
“programs” are written using an object-oriented programming style.

Objects  in  the  scene  domain  are  often  structured  into  hierarchies.  It  is 
often  natural  and  convenient  to  preserve  these  hierarchies  when  we  construct 
the  scene  model.  Links  are  used  to  describe  the  hierarchical  relations  between 
objects. 

One object hierarchy often used is the generalization/specialization
hierarchy; CAN-BE and AKO links are employed to describe it. Link CAN-BE
describes a frame and its specializations while link AKO describes a
frame and its generalizations.


Properties are inherited through the AKO link. This usage is similar to
the “property inheritance” in semantic networks ([Moor79], [Nils80]). All the
knowledge recorded in a father frame is inherited by the frames that are
linked to it by the AKO link. For example, both the RECTANGULAR-HOUSE
and the L-SHAPED-HOUSE have centroid, shape-description, front-of-house,
and connecting-driveway slots. Also, both the RECTANGULAR-HOUSE and the
L-SHAPED-HOUSE can use rule F_driveway to compute the connecting
driveway.
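
A frame hierarchy with AKO inheritance can be sketched in Python as follows. The `Frame` class, its method names, and the stub rule bodies are assumptions for illustration only; the frame, slot, and rule names are taken from the text. Slots and rules that a frame lacks are looked up along its AKO link.

```python
class Frame:
    """A frame with slots, rules, an AKO link, and CAN-BE links."""

    def __init__(self, name, slots=None, rules=None, ako=None):
        self.name = name
        self.slots = dict(slots or {})
        self.rules = dict(rules or {})
        self.ako = ako          # generalization (father frame), if any
        self.can_be = []        # specializations of this frame
        if ako is not None:
            ako.can_be.append(self)

    def all_slots(self):
        """Own slots plus everything inherited through AKO."""
        inherited = self.ako.all_slots() if self.ako else {}
        return {**inherited, **self.slots}

    def lookup_rule(self, rule_name):
        """Search this frame, then its generalizations, for a rule."""
        frame = self
        while frame is not None:
            if rule_name in frame.rules:
                return frame.rules[rule_name]
            frame = frame.ako
        return None


house = Frame(
    "HOUSE",
    slots={"centroid": None, "shape-description": None,
           "front-of-house": None, "connecting-driveway": None},
    rules={"F_driveway": lambda instance: "driveway of " + instance},  # stub
)
rectangular = Frame("RECTANGULAR-HOUSE",
                    rules={"F_rectangle": lambda instance: "rectangle"},  # stub
                    ako=house)
l_shaped = Frame("L-SHAPED-HOUSE", ako=house)

print(sorted(l_shaped.all_slots()))
print(rectangular.lookup_rule("F_driveway")("H1"))
```

Note that lookup only travels upward: the HOUSE frame cannot find F_rectangle, which is why reasoning across CAN-BE links needs separate strategies.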

Often,  the  HLVS  needs  to  reason  across  the  CAN-BE  link.  For  example, 
suppose  the  HLVS  needs  to  compute  the  shape  of  a  house.  The  HLVS  is  not 
able  to  do  the  computation  since  there  is  no  such  rule  recorded  in  the  HOUSE 
frame.  Instead,  the  HLVS  needs  to  reason  about  what  specialization  to  choose, 
i.e.,  RECTANGULAR-HOUSE  or  L-SHAPED-HOUSE.  The  strategies  for  this 
type  of  reasoning  are  called  specialization  strategies  and  are  encoded  as  rules 
and recorded in frames. Attaching such search strategies using CAN-BE links
is similar to the process of “plan elaboration” in Garvey’s system [Garv76].

As an example, suppose that there are two types of houses, rectangular
and L-shaped, in community A. Every house has a driveway. However, each
type of house has a different appearance. Suppose F_rectangle is a rule which
computes the shape description of a rectangular house, and F_driveway is
another rule which finds the driveway connecting to a rectangular house. Rule
F_driveway computes the driveway of a house. We can write the house model as
shown in Figure 2-1. In this model, the HOUSE frame is a generalization of
the L-SHAPED-HOUSE frame and the RECTANGULAR-HOUSE frame while
the L-SHAPED-HOUSE frame and RECTANGULAR-HOUSE frame are
specializations of the HOUSE frame. Their hierarchical relations are shown in
Figure 2-2.

2.3.  Instantiation  of  a  frame 

Frames  are  the  prototypes  of  objects.  The  SIGMA  system  uses  frames  as 
models  to  construct  interpretations  of  the  image  by  making  instances  of 
frames.  An  instance  is  a  copy  of  a  frame.  The  process  of  making  instances  is 
called  instantiation.  At  instantiation,  values  can  be  assigned  to  slots.  These 
values  may  be  the  “defaults”  (specified  in  the  frame  definition)  or  may  be 
computed  using  rules.  Since  all  instances  are  recorded  in  the  iconic/symbolic 
database  in  the  HLVS  as  basic  database  entities,  we  use  the  term  Database 
Entities  (DE’s)  interchangeably  with  the  term  “instances”  in  the  rest  of  the 
paper. 

An important property of an object is its appearance. During the
analysis, the HLVS needs to direct the LLVS (Low Level Vision System) to
process the image and locate image structures which are associated with
objects. Some objects’ appearances are defined in terms of image structures
that can be directly computed by the LLVS. Those frames which define such
objects are called primitive frames. Frames which are not primitive are called
non-primitive frames.


Depending  on  what  is  known  about  the  appearance  of  an  instance,  an 
instance  can  be  in  one  of  the  following  two  states:  verified,  which  indicates 
that  the  appearance  of  the  instance  is  some  already  located  image  structure  or 
is  a  function  of  the  appearances  of  verified  instances;  and  hypothetical,  which 
indicates  that  the  appearance  of  the  instance  has  not  been  determined. 

In addition to the appearances of objects, the HLVS also uses the iconic
description of a frame during its reasoning. The iconic description specifies an
area in the image and its definition is specified by a rule. During
hypothesis integration, the HLVS uses the iconic descriptions to reason
whether two DE’s are related (explained in Section 3). The use of iconic
description in SIGMA is similar to the use of “functional areas” in McKeown’s
SPAM aerial interpretation system [McKe84].

The values recorded in instances may be updated during the analysis.
Every instance has a special numerical value which is called the strength of
the instance. The method used to compute strength is described as a
procedure, say P_strength, in the frame’s definition. Upon instantiation, a strength
is computed for each instance. Whenever the values recorded in an instance
are updated, the strength of the instance is also recomputed by reevaluating
P_strength. The HLVS uses such values to control its focus of attention
mechanism.

Suppose one defines the appearance of a house (house frame) as a
rectangular compact region and a row of houses (house-group frame) as the
union of the appearances of all the houses in a house-group. Then the house
frame is primitive while the house-group frame is non-primitive. In SIGMA,
in order to locate a house-group, the HLVS first generates hypotheses about
the location of member houses and then directs the LLVS to locate each house
individually.

Now, suppose that the LLVS located a rectangular compact region, R0.
The HLVS will generate a house instance, H1, whose appearance is R0 and
mark it as a verified instance. However, suppose the HLVS further generates
neighboring house predictions for H1, say H2 and H3. Both H2 and H3 are
hypothetical instances since the appearances of these instances have not yet
been determined from the image.
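
The two instance states can be modeled with a small Python sketch. The `Instance` class is hypothetical, and one point is an assumption rather than something the text specifies: a non-primitive instance (such as a house-group) is treated here as verified once all of its member instances are verified.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Instance:
    name: str
    appearance: Optional[str] = None          # located image structure, if any
    members: List["Instance"] = field(default_factory=list)  # non-primitive only

    @property
    def state(self):
        if self.appearance is not None:
            return "verified"
        # Assumption: a composite appearance exists once every member is verified.
        if self.members and all(m.state == "verified" for m in self.members):
            return "verified"
        return "hypothetical"


h1 = Instance("H1", appearance="R0")   # region R0 located by the LLVS
h2 = Instance("H2")                    # predicted neighbor, not yet located
h3 = Instance("H3")
group = Instance("house-group", members=[h1, h2, h3])

print([i.state for i in (h1, h2, h3, group)])
```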

2.4.  Representing  relations  between  objects 

A major portion of the scene domain knowledge involves relations
between objects. However, these relations must be represented in forms that
can be directly used by the HLVS. Our approach is influenced by production
rules [Davi77] and the planning paradigm used in Garvey’s vision system
[Garv76].

Suppose  we  have  the  following  house-road  relation: 


A road road0 is along a house house0 if the predicate
along(road0, house0) is true.

There are at least two potential uses of this relation by the HLVS:

- HLVS uses the relation to check whether road road0 is along
house house0.

- HLVS uses the relation to direct a search for a road along house
house0.

In order to support multiple uses of a relation by the HLVS, we use a
test-hypothesize-and-act strategy to describe relations. A binary relation
REL(O1, O2) between objects O1 and O2 is represented using two functional
descriptions:

O1 = F(O2) and O2 = G(O1).

Program F computes the object expected by object O2 and is recorded in
object frame O2 as a rule. Program G computes the object expected by object
O1 and is recorded in object frame O1 as a rule also.

As  noted  earlier,  control  knowledge  for  the  use  of  relations  and  control 
knowledge  for  directing  search  are  both  required  by  the  HLVS.  We  represent 
such  knowledge  as  predicates  associated  with  rules. 

We  present  our  rule  representation  scheme  as  follows: 


A rule is composed of three parts:


<control-condition>

<hypothesis>

<action>.


<Control-condition> is a predicate. It indicates when a rule can potentially
be applied. <Hypothesis> specifies the description of a desired object that is
created when the <control-condition> evaluates to true. <Action>
describes the code to be evaluated if <hypothesis> is verified. In general,
<action> can add facts to or delete facts from the iconic/symbolic database
of the HLVS.
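
A three-part rule of this shape can be sketched in Python. The `Rule` class, the `apply_rule` helper, the dictionary representation of instances, and the example driveway rule are all hypothetical; in particular, the `verify` callback stands in for whatever verification the HLVS and LLVS would actually perform.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    control_condition: Callable[[dict], bool]    # when the rule may be applied
    hypothesis: Callable[[dict], dict]           # description of the desired object
    action: Callable[[dict, dict, list], None]   # run once the hypothesis is verified


def apply_rule(rule, instance, database, verify):
    """Test the condition, hypothesize, and act only if verification succeeds."""
    if not rule.control_condition(instance):
        return None
    hypothesis = rule.hypothesis(instance)
    match = verify(hypothesis)
    if match is not None:
        rule.action(instance, match, database)
    return hypothesis


# Hypothetical rule: look for a driveway once the house centroid is known.
driveway_rule = Rule(
    control_condition=lambda house: "centroid" in house,
    hypothesis=lambda house: {"type": "driveway", "near": house["centroid"]},
    action=lambda house, found, db: db.append(
        ("connects", found["name"], house["name"])),
)

database = []
house = {"name": "H1", "centroid": (120, 80)}
apply_rule(driveway_rule, house, database, verify=lambda h: {"name": "D0"})
print(database)
```

Keeping the condition separate from the hypothesis lets the caller postpone a rule until enough is known about the instance, which is the "when to look" knowledge described in Section 2.1.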

The  house-road  relation  can  be  written  as  a  rule  in  the  HOUSE  frame  as 
follows  (Figure  2-3): 


To compute a road along house house0, we always generate a
hypothesis road1 with the following slot values:

road.orientation:

greater than (house0.front-of-house + 80 degrees) but less than
(house0.front-of-house + 100 degrees),

road.width:

greater than (house0.width * 0.3) but less than (house0.width *
0.5).

road.centroid:

resides within REGION(house0.centroid + T(house0.front-of-house)).

T(.) is a function. If the hypothesis road1 is verified by some road
road0, then road road0 is along house house0.
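
The orientation and width constraints of this rule can be written out as a short Python sketch. The function names and the dictionary representation are assumptions; the centroid constraint is omitted because T(.) and REGION are not defined in this section, and angle wrap-around at 360 degrees is ignored for simplicity.

```python
def road_hypothesis(house):
    """Open intervals a road along `house` is expected to satisfy."""
    front = house["front_of_house"]    # orientation in degrees
    width = house["width"]
    return {
        "orientation": (front + 80, front + 100),
        "width": (width * 0.3, width * 0.5),
    }


def satisfies(road, hypothesis):
    """True if every constrained slot of `road` lies strictly inside its interval."""
    return all(low < road[slot] < high
               for slot, (low, high) in hypothesis.items())


house0 = {"front_of_house": 0, "width": 40}     # made-up values
road1 = road_hypothesis(house0)

print(satisfies({"orientation": 90, "width": 15}, road1))
print(satisfies({"orientation": 10, "width": 15}, road1))
```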


Figure 2-4 shows a model for suburban housing developments. Objects
are described by nodes (squares) and relations are described by arcs. In this
model, Rectangle and Picture-Boundary are the “primitive frames”.

The HLVS makes use of the different parts of a rule to perform its
reasoning. We discuss this in Section 4.


3.  Integration  of  hypotheses 


3.1.  Introduction 

Consider a binary relation REL(O1, O2) between two classes of objects,
O1 and O2. This relation can be used as a constraint to recognize objects from
these two classes by first extracting image structures which satisfy the
specified appearances of O1 and O2, and then checking that the relation is
satisfied by these candidate objects (Figure 3-1). In this bottom-up recognition
scheme, analysis based on relations cannot be performed until image
structures corresponding to objects are extracted.

In general, however, some of the correct image structures fail to be
extracted by the initial image segmentation. So one must, additionally,
incorporate top-down control to find image structures missed by the initial
segmentation. Such top-down processes use relations to predict the locations of
missing objects, as in the systems described by Garvey [Garv76] and Selfridge
[Self82].

As noted above, the use of relations is very different in the two analysis
processes: consistency verification in bottom-up analysis and hypothesis
generation in top-down analysis. An important characteristic of our hypothesis
integration method is that it enables the system to integrate both bottom-up
and top-down processes into a single flexible spatial reasoning process.

As will be described in Section 4, the HLVS first establishes local
environments. Then, either bottom-up or top-down processes are activated depending
on the nature of the local environment. The following sections describe the
concepts and characteristics of this process.

3.2.  The  representation  of  database  entities 

All  instances,  hypothetical  or  verified,  generated  by  the  HLVS  are 
recorded  in  a  database.  In  the  rest  of  this  section,  we  use  the  term  database 
entity  (DE)  to  refer  to  instances  recorded  in  the  database.  In  addition,  we  use 
the  term  hypothesis  to  refer  to  instances  in  the  hypothetical  state. 

The  description  of  each  DE  consists  of  two  parts.  One  part  is  the  iconic 
description.  This  description  is  a  region  in  the  image  which  indicates  where 
the  DE  may  be  located.  It  is  generated  by  the  rule  which  specifies  the  iconic 
description  of  the  frame  used  to  generate  the  DE. 

The  second  part  is  the  symbolic  description,  which  includes  the  values 
filled  into  the  slots  of  the  DE,  and  the  set  of  constraints  imposed  on  these 
values.  These  constraints  are  represented  by  a  set  of  linear  inequalities  in  one 
variable  (the  slot  name). 

3.3.  Consistency  between  a  pair  of  DE’s 

“Related” DE’s are integrated and analyzed together. In SIGMA,
“relatedness” between DE’s is defined in terms of “consistency” between pairs of
DE’s. A pair of DE’s, DE1 and DE2, are said to be consistent if the following
conditions hold:


1)  The  iconic  descriptions  of  the  DE’s  must  intersect.  It  is  also  possible  to 
impose  some  requirements  on  the  size  and  shape  of  the  area  of  intersection. 


2) The DE’s are compatible. Let OP be the intersection arising from two
DE’s, and let F1 and F2 denote the frames from which DE1 and DE2 were
copied. DE1 and DE2 are said to be compatible if F1 and F2 are linked by
CAN-BE or AKO links. Otherwise, DE1 and DE2 are said to be incompatible.
This will be explained in more detail in Section 3.5.


3) The constraints imposed on the attributes of the DE’s must be satisfiable.
Every DE has associated with it a set of linear inequalities in one variable
that constrain the permissible values of the DE’s attributes. A simple
constraint manipulation system is used to check the consistency between the sets
of inequalities by generating the solution space (also represented by
inequalities) to the intersection of those sets. If this solution space is non-empty, then
the constraints are consistent.


3.4.  Formation  of  maximum  consistent  situations 

Consistent DE’s are combined into situations. These DE’s are said to
participate in the formation of a situation. The P-set of a situation is its
set of participating DE’s. Situation Si is less than situation Sj if the
P-set of Si is a subset of the P-set of Sj. This ordering is used to
structure all the situations into a situation lattice. Note that a single DE
is also a situation. The rest of this section presents the algorithm used to
form situations.

Two DE’s are said to be 2-consistent if they are consistent. In general, a
set of n DE’s is said to be n-consistent if every possible subset of (n-1) of
the DE’s is (n-1)-consistent. Clearly, a set of DE’s is n-consistent if and
only if all possible pairs of DE’s in the set are 2-consistent.

When a DE, say DEnew, is inserted into the iconic/symbolic database, the
current situation lattice is updated by first computing the set, U, that
contains all DE’s whose iconic descriptions intersect with the iconic
description of DEnew. Then, we iteratively compute all lists of n-consistent
DE’s for those DE’s in the set U. Each such list of n-consistent DE’s forms
the P-set of some situation. Algorithm 3-1 describes this process.

The  maximum  consistent  situations  are  those  situations  which  are  the 
roots  of  the  situation  lattice. 

Algorithm  3-1  :  Updating  the  Situation  Lattice 

Step 1: Suppose the newly inserted entity is DEnew. Compute the set U.
        Set N = 2.

Step 2: Compute the set, R, of all the N-consistent sets of DE’s for the
        DE’s in U. Remove any which do not contain DEnew.

Step 3: If R is empty, then exit. Otherwise, insert all the elements of R
        into the situation lattice.

Step 4: Increment N by 1. Construct all the pairs of elements in R.
        Represent each pair by the union of the members in each element.
        Remove any which is not N-consistent or does not contain DEnew.
        Set R to be the set of resulting N-consistent sets of DE’s.
        Return to Step 3.
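
The steps of Algorithm 3-1 can be sketched in Python as follows. This is
only an illustrative sketch, not the SIGMA implementation: the function
name update_lattice, the representation of a situation as a frozenset of
DE names, and the caller-supplied pairwise test consistent(a, b) are all
assumptions. N-consistency is checked through the pairwise criterion stated
in the text.

```python
from itertools import combinations


def update_lattice(lattice, de_new, u, consistent):
    """Add every situation containing de_new to `lattice` (a set of frozensets).

    `u` is the set U of DEs whose iconic descriptions intersect de_new, and
    `consistent(a, b)` is an assumed pairwise (2-consistency) test.
    """
    def pairwise_consistent(p_set):
        # A set is n-consistent iff all of its pairs are 2-consistent.
        return all(consistent(a, b) for a, b in combinations(p_set, 2))

    n = 2
    # Step 2: the 2-consistent sets that contain de_new.
    r = {frozenset((de_new, d)) for d in u
         if d != de_new and consistent(de_new, d)}
    while r:                       # Step 3: exit when R is empty
        lattice |= r               # Step 3: insert the elements of R
        n += 1                     # Step 4: pair up the elements of R
        r = {a | b for a, b in combinations(r, 2)
             if len(a | b) == n and pairwise_consistent(a | b)}
    return lattice


# Hypothetical data: e is consistent with a, b and d; a is consistent with d.
pairs = {frozenset(p) for p in ("ae", "be", "de", "ad")}
result = update_lattice(set(), "e", "abcde",
                        lambda x, y: frozenset((x, y)) in pairs)
```

Every element of R contains de_new by construction, so the unions formed in
Step 4 need no separate membership test.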


Figure 3-2 shows an example of how the situation lattice is updated when
a DE is inserted. Each DE is represented by a letter. A situation is
represented by all the DE’s in its P-set. Figure 3-2(a) shows the situation
lattice before the insertion of DEe and the iconic descriptions of the DE’s.
Suppose that the new DE, DEe, is consistent with DEa, DEb and DEd. The set
U would then include

DEa, DEb, DEc, DEd, DEe.

The first time that step 3 is evaluated, set R contains the following
situations:

DEae, DEbe, DEde.

The second time that step 3 is evaluated, set R contains the following
situation:

DEade.

The  updating  stops  at  the  third  iteration.  Figure  3-2(b)  shows  the  situation 
lattice  after  the  updating  process. 

When a DE, say DEremove, is being removed from the iconic/symbolic
database, the current situation lattice must also be updated. This can be done
simply by removing all the situations in the situation lattice which are larger
than DEremove.

Suppose, for example, that DEa is removed from the situation lattice shown
in Figure 3-2(b). Figure 3-3 shows the resulting situation lattice.

It  is  possible  that  the  number  of  situations  in  the  situation  lattice  may 
grow  exponentially.  In  practice,  this  does  not  happen  since  the  number  of 
participants  in  a  situation  is  usually  quite  small,  e.g.,  two  or  three. 

3.5.  Constructing  the  composite  hypothesis 

A situation is a collection of consistent DE’s. The HLVS selects a
situation and proposes a composite hypothesis which “summarizes” the
constraints imposed on the attributes of all the participating DE’s. The
strategy for computing the composite hypothesis is specified by a procedure
recorded in the frame’s definition. (Note that two DE’s are consistent only
if they are instances of the same frame or instances of frames in the same
generalization/specialization hierarchy. Therefore, all the participants in a
situation must be instances of frames in the same generalization/specialization
hierarchy. The procedure for computing the composite hypothesis is recorded
in the most general frame.) This section presents some strategies for
computing the composite hypothesis.

One simple strategy is to use the intersection of the solution sets of all
the constraints imposed on the attributes of all the participating DE’s
(explained in Section 3.4) as the constraint set of the composite hypothesis.
The target object of the composite hypothesis is the most specialized object
expected by all the DE’s.


Suppose that the constraint set of DE1 is

target object = HOUSE,
house.centroid = (100,130),
230 < house.area < 300

while the constraint set of DE2 is

target object = RECTANGULAR-HOUSE,
house.centroid = (100,130),
250 < house.area < 320,
house.region-contrast > 3.

Using this method, we generate the composite hypothesis for DE1 and DE2 as
follows:

target object = RECTANGULAR-HOUSE,
house.centroid = (100,130),
250 < house.area < 300,
house.region-contrast > 3.
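
The example above can be reproduced with a small Python sketch. It is
written under stated assumptions rather than taken from SIGMA: the
dictionary layout of a DE, the AKO table, and the function names is_a and
compose are hypothetical, and each range constraint is modeled as an open
interval on a single attribute.

```python
from math import inf

# Hypothetical AKO (a-kind-of) links, child frame -> parent frame.
AKO = {"RECTANGULAR-HOUSE": "HOUSE"}


def is_a(frame, ancestor):
    """True if `frame` equals `ancestor` or specializes it through AKO links."""
    while frame is not None:
        if frame == ancestor:
            return True
        frame = AKO.get(frame)
    return False


def compose(de1, de2):
    """Return the composite hypothesis of two DEs, or None if they conflict."""
    # The target is the more specialized of the two expected objects.
    if is_a(de1["target"], de2["target"]):
        target = de1["target"]
    elif is_a(de2["target"], de1["target"]):
        target = de2["target"]
    else:
        return None
    # Equality constraints must agree wherever both DEs state them.
    fixed = dict(de1["fixed"])
    for attr, value in de2["fixed"].items():
        if fixed.setdefault(attr, value) != value:
            return None
    # Range constraints are intersected attribute by attribute.
    bounds = {}
    for attr in de1["bounds"].keys() | de2["bounds"].keys():
        lo1, hi1 = de1["bounds"].get(attr, (-inf, inf))
        lo2, hi2 = de2["bounds"].get(attr, (-inf, inf))
        lo, hi = max(lo1, lo2), min(hi1, hi2)
        if lo >= hi:               # empty solution space
            return None
        bounds[attr] = (lo, hi)
    return {"target": target, "fixed": fixed, "bounds": bounds}


de1 = {"target": "HOUSE", "fixed": {"centroid": (100, 130)},
       "bounds": {"area": (230, 300)}}
de2 = {"target": "RECTANGULAR-HOUSE", "fixed": {"centroid": (100, 130)},
       "bounds": {"area": (250, 320), "region_contrast": (3, inf)}}
composite = compose(de1, de2)
```

A return value of None corresponds to an empty solution space, that is, to a
pair of DE’s that would not be consistent in the first place.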


Another strategy is to take the union of all the solution sets of the
constraints imposed on the attributes of all the participating DE’s. Suppose,
for example, that two hypotheses, DE1 and DE2, about a road have constraints
on their starting and ending points as follows:

hypothesis DE1,
target object = road,
road.end-points = ((100,100), (100,150))

hypothesis DE2,
target object = road,
road.end-points = ((100,125), (100,180))

We may want to construct a road hypothesis whose constraint set is the union
of these constraints on DE1 and DE2:

target object = road,
road.end-points = ((100,100), (100,180)).
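
For the road example, the union amounts to keeping the two extreme end
points. A minimal Python sketch follows; the function name union_segments
is hypothetical, and the sketch assumes, without checking, that the two
segments are collinear and overlap.

```python
def union_segments(s1, s2):
    """Merge two overlapping collinear segments into one.

    Each segment is a pair of (x, y) end points. For collinear points,
    lexicographic order follows the order along the line, so the first and
    last points after sorting are the extreme end points.
    """
    points = sorted([*s1, *s2])
    return (points[0], points[-1])


merged = union_segments(((100, 100), (100, 150)), ((100, 125), (100, 180)))
```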

4. An implementation of SIGMA

4.1.  Overview 

The  goal  of  SIGMA  is  to  segment  the  image  into  image  structures  which 
correspond  to  the  objects  specified  in  the  object  model.  Section  1.3  outlined 
the  architecture  of  the  SIGMA  image  understanding  system.  This  section 
describes  its  implementation. 

Figure 4-1 illustrates the different stages of the control of SIGMA.
SIGMA first directs the LLVS to perform an initial segmentation of the
image. A set of image structures is computed at this stage. At the second
stage, the HLVS constructs partial interpretations based on the results of the
initial segmentation. During this construction, however, the HLVS may direct
the LLVS to compute more image structures. When all construction activities
finish, SIGMA provides a query-answering module for selecting “good
interpretations” and displaying the reasoning paths used to derive these
interpretations. During the entire analysis, SIGMA maintains a database
(the iconic/symbolic database) to record all the intermediate results
generated at each stage.

The rest of this section discusses the implementation of SIGMA.

4.2.  Description  of  goals 

The Query-Answering Module (QAM) is activated by the HLVS at the
end of each reasoning iteration. The goal of SIGMA is described as a query to
QAM. QAM matches the query with the interpretations already constructed.
If any interpretation matches the goal, QAM enters into an answer mode and
provides an interactive query-answering capability.

Suppose,  for  example,  that  the  goal  is  to  locate  any  road  whose  length  is 
longer  than  300  feet  in  the  image  and  has  at  least  two  houses  along  it.  This 
goal  can  be  represented  by  the  following  query: 

road(x) and (x.length > 300 feet) and (x.number-of-houses ≥ 2).

During  the  interpretation  stage,  whenever  a  road  instance  is  constructed 
whose  length  is  longer  than  300  feet  and  has  at  least  two  houses  along  it  (i.e., 
x  is  bound  to  some  interpretation  constructed  by  the  HLVS),  QAM  will  enter 
an  answer  mode  and  make  the  specific  road  instance  that  satisfies  the  goal 
available  to  an  interactive  program.  One  can  use  this  program  to  traverse  the 
interpretation  network  (the  network  which  is  constructed  by  the  HLVS  during 
the  interpretation  process),  and  display  symbolic  and  iconic  descriptions  of  the 
interpretations  constructed. 

4.3.  The  initial  segmentation 

SIGMA starts its processing by directing the LLVS to extract image
structures. The schematic diagram of the initial segmentation process is
shown in Figure 4-2. The set, I, which contains a list of hypotheses about
primitive objects, is used to describe the goal of the initial segmentation
process.

The  Initial  Segmentation  Controller  (ISC)  sequentially  selects  hypotheses 
from  the  set  I  and  directs  the  LLVS  to  extract  image  primitives  which  satisfy 
these  hypotheses.  For  each  image  primitive  extracted,  the  ISC  makes  an 
instance  of  the  frame  of  which  the  hypothesis  is  a  copy,  and  then  inserts  the 
instance  created  into  the  iconic/symbolic  database. 

Suppose, for example, that we want to first extract all regions which
might correspond to house groups and roads in the image. A set which
contains the following hypotheses can be used as the set I:

hypothesis 1: /* extract compact and bright rectangles */
target object = rectangle,
in-window = whole image,
rectangle.elongatedness < 10,
rectangle.compactness < 18,
rectangle.region-contrast > 3,
180 < rectangle.area-of < 400.

hypothesis 2: /* extract elongated rectangles */
target object = rectangle,
in-window = whole image,
7 < rectangle.width < 20,
rectangle.elongatedness > 10,
rectangle.length > 10,
rectangle.compactness > 18,
rectangle.region-contrast > 3.
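
The way the ISC uses the set I can be sketched in Python as below. This is
a simplified illustration under assumptions: the LLVS is replaced by a
filter over regions that are already measured, and the names HYPOTHESES,
satisfies and initial_segmentation, as well as the attribute keys, are
hypothetical.

```python
from math import inf

# The two hypotheses of the set I, as open intervals on region attributes.
HYPOTHESES = [
    {"elongatedness": (-inf, 10), "compactness": (-inf, 18),
     "region_contrast": (3, inf), "area": (180, 400)},
    {"width": (7, 20), "elongatedness": (10, inf), "length": (10, inf),
     "compactness": (18, inf), "region_contrast": (3, inf)},
]


def satisfies(region, hypothesis):
    """True if every constrained attribute lies strictly inside its interval."""
    return all(lo < region[attr] < hi for attr, (lo, hi) in hypothesis.items())


def initial_segmentation(regions, hypotheses):
    """Create one rectangle instance per (hypothesis, matching region) pair."""
    database = []
    for hypothesis in hypotheses:          # hypotheses are taken sequentially
        for region in regions:             # stand-in for the LLVS extraction
            if satisfies(region, hypothesis):
                database.append({"frame": "rectangle", **region})
    return database


# Hypothetical measurements for three candidate regions.
house_like = {"elongatedness": 2, "compactness": 15, "region_contrast": 5,
              "area": 250, "width": 15, "length": 17}
road_like = {"elongatedness": 30, "compactness": 40, "region_contrast": 4,
             "area": 3000, "width": 10, "length": 300}
dark = {"elongatedness": 2, "compactness": 15, "region_contrast": 1,
        "area": 250, "width": 15, "length": 17}
instances = initial_segmentation([house_like, road_like, dark], HYPOTHESES)
```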


The set I for the initial segmentation could, in principle, be computed
from the scene model, since the appearances of objects are described in terms
of the appearances of “primitive frames”. The ISC could choose those
primitive frames whose appearances are salient (i.e., they can be located
“easily” by the LLVS) as the I-set. However, this was not implemented in
SIGMA; the I-set is simply given as part of the scene model.


4.4.  Construction  of  partial  interpretations 

The  schematic  diagram  of  the  processing  involved  in  constructing  partial 
interpretations  is  shown  in  Figure  4-3.  The  HLVS  iterates  the  following  steps 
in  this  stage: 

(1)  hypothesis  generation, 

(2)  focus  of  attention, 

(3)  composite  hypothesis  construction, 

(4)  solution  generation, 

(5)  action  scheduling. 

Detailed  discussions  of  each  step  are  presented  in  the  following  subsections. 

4.4.1.  Hypothesis  generation 

For  each  DE  (hypothetical  or  verified)  recorded  in  the  iconic/symbolic 
database,  the  Iconic/Symbolic  Database  Manager  (ISDM)  evaluates  all  the 
rules  that  are  “applicable”. 

Suppose I0 is an instance of frame F. For each rule, say Ri, defined in
frame F, the ISDM evaluates the <control-condition> part of rule Ri. If the
evaluation result is true, the ISDM performs the following tasks:

(1) Compute the <hypothesis> part of rule Ri, and insert the
computed hypothesis into the iconic/symbolic database.

(2) Insert the <action> part of rule Ri into the Action List which
records all the actions waiting to be evaluated.

The  actions  in  the  action  list  are  called  delayed  actions.  For  each 
delayed  action,  there  is  an  associated  hypothesis  (computed  at  step  1) 
recorded  in  the  iconic/symbolic  database.  Such  a  hypothesis  is  called  the 
cause  of  delay  of  the  action. 

Note that for rules whose <hypothesis> part is nil, the <action> part
is not put into the action list. Instead, the <action> is evaluated
immediately. At the hypothesis generation stage, the ISDM evaluates, for each
instance in the iconic/symbolic database, the <control-condition> of every
rule in the associated frame definition. (This strategy is not efficient. A
more efficient strategy would evaluate only those <control-condition>s whose
values are affected by changes made to the attributes of the instance since
the last time the <control-condition>s were evaluated.)
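
The hypothesis generation step described above can be sketched in Python as
follows. The Rule class, the function generate_hypotheses and the tuple
layout of the action list are assumptions made for illustration; they are
not the data structures used in SIGMA.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Rule:
    control_condition: Callable[[Any], bool]
    hypothesis: Optional[Callable[[Any], Any]]   # None plays the role of nil
    action: Callable[[Any, Any], None]           # called as action(instance, solution)


def generate_hypotheses(instances, rules_of, database, action_list):
    """Evaluate every applicable rule of every instance once."""
    for instance in instances:
        for rule in rules_of(instance):
            if not rule.control_condition(instance):
                continue
            if rule.hypothesis is None:
                # No hypothesis to wait for: evaluate the action immediately.
                rule.action(instance, None)
            else:
                hyp = rule.hypothesis(instance)
                database.append(hyp)
                # The action is delayed; hyp is its cause of delay.
                action_list.append((rule.action, instance, hyp))


# Hypothetical rules attached to a single house group instance "HG0".
log = []
r1 = Rule(lambda s: True, lambda s: ("ROAD-near", s), lambda s, h: log.append("delayed"))
r2 = Rule(lambda s: True, None, lambda s, h: log.append("now"))
r3 = Rule(lambda s: False, None, lambda s, h: log.append("never"))
db, actions = [], []
generate_hypotheses(["HG0"], lambda inst: [r1, r2, r3], db, actions)
```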

The DE’s in the iconic/symbolic database are combined into situations.
All the situations are structured into the situation lattice. The Situation
Lattice Database Manager (SLDM) updates the situation lattice whenever DE’s
are inserted into or removed from the iconic/symbolic database. The
algorithm (3-1) for updating the situation lattice was presented in
Section 3.4.

“Identical instances” may be created during the construction process of
the HLVS. Two instances are identical if all the values filled in the slots of
those instances are identical. It is necessary to detect identical instances
and replace them by a single instance. This process is called unification of
instances, and is performed during construction of composite hypotheses.

For example, a house group instance containing house instances H0 and
H1 can be constructed from instance H0 by first constructing a house group
instance, say HG0, which contains H0, and then expanding HG0 to include
house instance H1 (see Figure 4-4(a)). An identical house group instance HG1
can also be constructed from house instance H1 (see Figure 4-4(b)).

One  natural  way  to  detect  identical  instances  is  to  examine  the  P-set  of  a 
situation.  For  each  situation  selected  by  the  focus  of  attention  mechanism,  the 
HLVS  examines  the  instances  in  the  P-set  of  the  situation  to  find  sets  of 
identical  instances. 

The HLVS unifies identical instances as follows. All identical instances
are first collected in a set, L. Then the HLVS selects one instance from the
set L, say I0. For each instance Ii ∈ L, the HLVS replaces every reference to
Ii in the iconic/symbolic database by a reference to instance I0.

Figure 4-5 illustrates the result of unifying HG0 and HG1 (assuming the
HLVS chooses HG0 as I0).
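
The unification step can be illustrated with the following Python sketch.
The database layout (instance name mapped to slots, each slot holding a
list of instance names) and the function name unify are assumptions chosen
for brevity, not the representation used by SIGMA.

```python
def unify(identical, database):
    """Keep the first instance of `identical` and redirect all references to it.

    `database` maps an instance name to its slots; each slot value is a list
    of instance names.
    """
    keep, *duplicates = identical
    for dup in duplicates:
        database.pop(dup, None)            # drop the redundant instances
    for slots in database.values():
        for slot, refs in slots.items():
            redirected = (keep if ref in duplicates else ref for ref in refs)
            # dict.fromkeys removes repeated names while preserving order.
            slots[slot] = list(dict.fromkeys(redirected))
    return keep


# Hypothetical database mirroring the HG0/HG1 example.
database = {
    "H0": {"belongs_to": ["HG0"]},
    "H1": {"belongs_to": ["HG1"]},
    "HG0": {"members": ["H0", "H1"]},
    "HG1": {"members": ["H0", "H1"]},
}
kept = unify(["HG0", "HG1"], database)
```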

4.4.2. Focus of attention

The  focus  of  attention  mechanism  selects  a  situation  with  greatest 
strength  from  the  situation  lattice.  If  there  are  several  situations  with  equal 
strength,  the  HLVS  selects  one  arbitrarily. 

For example, Figure 4-6 shows a situation lattice. There are two maximal
consistent situations that can be selected (both situations have strength = 3).
The HLVS can select either one (i.e., N10 or N11).

The  situation  selected  by  the  focus  of  attention  mechanism  is  given  to 
the  Composite  Hypothesis  Constructor  to  construct  the  composite  hypothesis. 
The  construction  of  composite  hypotheses  was  discussed  in  Section  3.5. 

4.4.3.  Solution  generation 

The  Solution  Generator  (SG)  computes  solutions  for  the  composite 
hypothesis.  The  SG  obtains/constructs  instances  to  satisfy  the  composite 
hypothesis  by  one  of  the  methods  discussed  in  the  following  paragraphs. 

First, the SG may discover an existing instance in the iconic/symbolic
database that satisfies the composite hypothesis. In this case, the SG returns
the instance found as the solution. In general, it may be necessary to search
the iconic/symbolic database to find some instance which satisfies the
composite hypothesis. However, since the composite hypothesis is constructed
by taking the solution space of all the constraints imposed on the DE’s
participating in the situation (see Section 3.5), to find an existing
instance which satisfies the composite hypothesis, the SG needs only examine
the P-set of the selected situation and use any instance in the P-set as the
solution.

Suppose  the  SG  cannot  find  any  instance  in  the  iconic/symbolic  database 
that  satisfies  the  composite  hypothesis.  There  are  two  possibilities: 

(1) the target object of the composite hypothesis is a primitive
object (such hypotheses are called primitive hypotheses);

(2) the target object of the composite hypothesis is not a primitive
object (such hypotheses are called non-primitive hypotheses).

In the first case, the SG first directs a top-down segmentation by
providing to the LLVS the descriptions of the composite hypothesis. Then the
SG creates instances based on the results of the LLVS. Finally, the instances
created (if any) are returned as a solution.

In  the  second  case,  no  top-down  segmentation  is  performed.  The  SG 
simply  returns  the  composite  hypothesis  as  the  solution. 
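
The three outcomes of the Solution Generator can be summarized by the
Python sketch below. The function name generate_solution and its callback
parameters (is_instance, is_primitive, segment) are hypothetical stand-ins
for the database lookup and for the LLVS, and the sketch simplifies the
primitive case by returning only the first instance created.

```python
def generate_solution(composite, p_set, is_instance, is_primitive, segment):
    """Return an instance, None (nil), or the composite hypothesis itself."""
    # An existing instance in the P-set already satisfies the hypothesis.
    for de in p_set:
        if is_instance(de):
            return de
    if is_primitive(composite["target"]):
        # Primitive hypothesis: ask for a top-down segmentation.
        instances = segment(composite)
        return instances[0] if instances else None
    # Non-primitive hypothesis: no segmentation, hand the hypothesis back.
    return composite


primitive_frames = {"RECTANGLE"}           # hypothetical primitive frame set
is_inst = lambda de: de.startswith("I")    # hypothetical naming convention
is_prim = lambda target: target in primitive_frames
rect = {"target": "RECTANGLE"}
house = {"target": "RECTANGULAR-HOUSE"}
found = generate_solution(rect, ["H1", "I7"], is_inst, is_prim, lambda c: [])
segmented = generate_solution(rect, ["H1"], is_inst, is_prim, lambda c: ["I9"])
failed = generate_solution(rect, ["H1"], is_inst, is_prim, lambda c: [])
deferred = generate_solution(house, ["H1"], is_inst, is_prim, lambda c: [])
```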

4.4.4.  Action  scheduling 

The  Action  Scheduler  (AS)  schedules  the  actions  in  the  action  list  using 
the  solution  provided  by  the  SG.  Three  possible  types  of  solutions  may  be 
provided: 


(1)  nil, i.e.,  the  hypothesis  cannot  be  verified,

(2)  an  instance, 

(3)  a  composite  hypothesis. 


In both the first and the second cases, the AS selects those <action>s in
the action list whose “causes of delay” are in the P-set of the selected
situation. Let the solution be I0, the actions selected be A1, ..., An, and
their causes of delay be H1, ..., Hn, respectively. The AS performs the
selected actions sequentially; for each action Ai, it will:

(a) replace all the references to Hi in action Ai by I0,

(b) evaluate Ai,

(c) remove Hi from the iconic/symbolic database, or update the
attributes of Hi (we will discuss this in more detail in Section 4.5).
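
Steps (a) to (c) can be sketched in Python as follows. This is an
illustration under assumptions: each delayed action is stored as a
hypothetical (action, instance, cause) tuple, substitution of the cause of
delay is modeled by passing the solution as an argument, and only the
"remove" branch of step (c) is shown.

```python
def schedule_actions(action_list, p_set, solution, database):
    """Evaluate the delayed actions whose cause of delay is in `p_set`."""
    remaining = []
    for action, instance, cause in action_list:
        if cause not in p_set:
            remaining.append((action, instance, cause))   # still delayed
            continue
        # (a) + (b): evaluate the action with the cause replaced by the solution.
        action(instance, solution)
        # (c): remove the hypothesis that delayed the action.
        if cause in database:
            database.remove(cause)
    action_list[:] = remaining


# Hypothetical delayed actions that fill a slot with the proposed solution.
filled = {}
fill = lambda inst, sol: filled.update({inst: sol})
hyp_db = ["H1", "H2", "H9"]
delayed = [(fill, "HG0", "H1"), (fill, "HG1", "H9")]
schedule_actions(delayed, {"H1", "H2"}, "I0", hyp_db)
```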


In the third case, the AS marks the composite hypothesis, say CH0, as
partially processed and inserts it into the iconic/symbolic database. The AS
also marks the currently selected situation, say S0, as unconcluded. The
hypothesis CH0 is said to be derived from the situation S0. We will present a
more detailed discussion of the effects of such processing in Section 4.4.4.1.
Table 4-1 summarizes the terms defined in the previous paragraphs.


Table 4-1. Glossary.

Primitive hypothesis:
A hypothesis whose target object is a primitive object.

Non-primitive hypothesis:
A hypothesis whose target object is a non-primitive object.

Unconcluded situation:
A situation which was selected by the focus of attention mechanism,
but for which the Solution Generator cannot yet compute a solution.

Partially processed hypothesis:
A composite hypothesis, recorded in the iconic/symbolic database,
which is computed for some unconcluded situation.


The removal of hypotheses from the iconic/symbolic database has the
following side effects:

(1) If a hypothesis, say H0, is removed from the database, then all the
situations in the situation lattice whose P-sets contain H0 are also removed
from the situation lattice.

(2) If an unconcluded situation is removed from the situation lattice in (1),
then the hypotheses which were derived from the situation are also removed
from the iconic/symbolic database.

The updating of attributes of hypotheses is implemented by removing the
original hypothesis and inserting a new hypothesis.

When all the actions selected are evaluated, the action scheduler
terminates, and the next cycle of hypothesis construction begins.


4.4.4.1. Computing solutions for a non-primitive composite hypothesis

The SG does not directly propose solutions for a non-primitive composite
hypothesis. Instead, a top-down parsing approach is used to compute the
solution. Suppose the composite hypothesis constructed for a situation, say
S0, is CHa. To compute the solution for CHa, we first generate a set of
hypotheses Hi, 1 ≤ i ≤ n, and compute the solution for each Hi. The solution
for CHa can be computed from the solutions for Hi, 1 ≤ i ≤ n.

To support such an approach, we associate with each non-primitive
frame a decomposition strategy (represented as a rule) which describes how to
generate a new set of hypotheses to be verified, and how to compute a
solution for the non-primitive frame using the solutions for the generated
hypotheses.

For example, the rule for the decomposition strategy of a
RECTANGULAR-HOUSE frame is

Rule Rfirst-order-properties:
<control-condition> : true,
<hypothesis> :
H = F0(RECTANGLE, self),
<action> :
if H = nil then conclude(nil)
else conclude(make-instance(RECTANGULAR-HOUSE, H)).

This rule indicates that a RECTANGULAR-HOUSE instance can be created
if a RECTANGLE instance which satisfies the attributes specified by F0 is
created.

As discussed in Section 4.4.4, the Action Scheduler (AS) marks the
non-primitive composite hypothesis as partially processed and inserts it into
the iconic/symbolic database. The AS also marks the situation selected as
unconcluded. Partially processed hypotheses and unconcluded situations are
processed by other modules of the HLVS in the following ways:

(1) If a situation, say S, is marked as “unconcluded”, then all the situations
in the situation lattice which are less than S are also marked as unconcluded.
The focus of attention mechanism does not select any unconcluded situation.
This strategy is based on the observation that if no conclusion can be drawn
from the analysis of a situation, say S, then the analysis of all the
situations which are “less than” S (i.e., composed of a subset of the
instances of S) can be postponed.

For example, by marking situation N10 in Figure 4-6 as unconcluded, all
the situations that are less than N10 are also marked as unconcluded (i.e.,
Ni, Hi, 1 ≤ i ≤ 3).

(2) The function “conclude” indicates that a solution, say Isol, has been
computed for an unconcluded situation, say S. Whenever this function is
evaluated, the HLVS schedules S as the situation to be selected in the next
iteration cycle and the solution proposed for the composite hypothesis of this
situation is Isol.

(3) Since a partially processed hypothesis, say H, is the composite hypothesis
constructed for some unconcluded situation, S, H should not participate in the
formation of new situations with any DE’s in the P-set of S. The HLVS uses the
more efficient strategy of not allowing a partially processed hypothesis to
participate in the formation of any situations.

(4) In the hypothesis generation process, only the rules which describe the
decomposition strategy can be evaluated for partially processed hypotheses.
All the hypotheses generated are inserted into the iconic/symbolic database.

(5)  The  removal  of  a  partially  processed  hypothesis  from  the  iconic/symbolic 
database  causes  the  removal  of  all  the  hypotheses  in  the  database  which  are 
generated  by  the  decomposition  strategy. 
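
Item (1) above, together with the focus of attention mechanism of Section
4.4.2, can be sketched in Python as follows. The function names
mark_unconcluded and select_focus are hypothetical. The text does not define
how strength is computed, so the sketch assumes, for illustration only, that
the strength of a situation is the size of its P-set.

```python
def mark_unconcluded(situation, lattice, unconcluded):
    """Mark `situation` and every situation less than it as unconcluded.

    Situations are frozensets of DE names; "less than" is the subset relation.
    """
    unconcluded.update(s for s in lattice if s <= situation)


def select_focus(lattice, unconcluded, strength=len):
    """Pick a strongest situation that is not unconcluded, or None."""
    candidates = [s for s in lattice if s not in unconcluded]
    return max(candidates, key=strength, default=None)


# Hypothetical lattice fragment with two maximal situations.
n10 = frozenset({"H1", "H2", "H3"})
n11 = frozenset({"H4", "H5", "H6"})
lattice = {n10, n11, frozenset({"H1", "H2"}), frozenset({"H1"}), frozenset({"H4"})}
unconcluded = set()
mark_unconcluded(n10, lattice, unconcluded)
focus = select_focus(lattice, unconcluded)
```

Ties between equally strong situations are broken arbitrarily by max, which
matches the arbitrary choice described in Section 4.4.2.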

Suppose, for example, that the situation N10 shown in Figure 4-6 is
selected by the focus of attention mechanism and the composite hypothesis
constructed, say CHa, is:

target object : RECTANGULAR-HOUSE;

Since RECTANGULAR-HOUSE is not a primitive frame, the SG returns CHa
as the solution to the situation N10. The AS marks N10 as unconcluded and
inserts CHa into the iconic/symbolic database.

At the subsequent hypothesis generation process, CHa activates the rule
Rfirst-order-properties in the RECTANGULAR-HOUSE frame and creates
hypothesis H9:

target object : RECTANGLE;

Figure 4-7 shows the relation between CHa and H9 and the action which is
delayed by H9. The resulting situation lattice is shown in Figure 4-8.

Suppose a RECTANGLE instance, say IR, is proposed to H9 by the SG.
The AS evaluates the action whose cause of delay is H9 and:

(1) creates a RECTANGULAR-HOUSE instance, say IRH,

(2) evaluates the function “conclude”.

The evaluation of the function “conclude” indicates to the HLVS that
situation N10 is to be scheduled in the next iteration cycle and the solution
proposed for CHa is IRH.

At the next iteration, the SG proposes IRH to the hypotheses in the P-set
of N10 (i.e., H1, H2, H3). Those actions whose causes of delay are H1, H2, and
H3 are now evaluated by the Action Scheduler. Suppose H1, H2, and H3 are
removed after the evaluation of these actions. Figure 4-9 shows the resulting
situation lattice. Note that this is usually the case when an appropriate
solution is proposed to the hypotheses.

The processing of partially processed hypotheses and unconcluded
situations is summarized in Table 4-2.

Table 4-2. Summary.

Unconcluded situation:

- Will not be selected by the focus of attention mechanism.

- If a solution is proposed by the SG for some unconcluded situation,
the HLVS schedules that situation in the next iteration cycle.

Partially processed hypothesis:

- A composite hypothesis for some unconcluded situation.

- Recorded in the iconic/symbolic database.

- Does not participate in the formation of any situations.

- Removal of a partially processed hypothesis, H, causes the removal of
all the hypotheses generated by H.

4.5. A taxonomy of actions

In this section, we discuss a taxonomy of the actions that are often used
to specify the scene domain knowledge. The term action in this section refers
to the activities described in the <action> part of a rule.

One type of action is the filling in of attributes of an instance. For
example, a rule in the HOUSE-GROUP frame is:

<control-condition> : true,
<hypothesis> : H = AR(self, ROAD),
<action> : self.along-road = H.

This rule specifies that if a ROAD instance which satisfies H is found, it is
filled into the slot “along-road” of the HOUSE-GROUP instance.

In addition to filling in attributes, actions often create new instances or
unify several instances (as described in Section 4.4.1). Such actions are
described by two functions:

- “make-instance” : create an instance and insert it into the iconic/symbolic
database;

- “unify-instance” : unify a list of instances in the iconic/symbolic database
into a single instance.

For  example,  a  rule  in  the  RECTANGLE  frame  is: 

<control-condition> : IS-RECT-HOUSE(self),
<hypothesis> : nil,
<action> :
make-instance(RECTANGULAR-HOUSE, F(self)).

This  rule  describes  the  following  piece  of  knowledge: 

“If  a  RECTANGLE  instance  which  satisfies  the  IS-RECT-HOUSE  criteria  is 
created,  then  create  a  RECTANGULAR-HOUSE  instance  using  function  F 
and  insert  it  into  the  iconic/symbolic  database.” 


Similarly,  the  following  piece  of  knowledge: 


“If  more  than  one  HOUSE-GROUP  instance  is  filled  in  the  “belongs-to”  slot 
of  a  HOUSE  instance,  replace  it  by  another  HOUSE-GROUP  instance  which 
is  created  by  the  function  COMBINE-H.” 

can  be  described  by  the  following  rule  in  the  HOUSE  frame: 


<control-condition> :
if number-of-elements(self.belongs-to) > 1,
<hypothesis> : nil,
<action> :
unify-instance(self.belongs-to, COMBINE-H(self.belongs-to)).


Another  class  of  actions  deals  with  the  removal  of  hypotheses  and  the 
updating  of  the  attributes  of  hypotheses.  Usually,  hypotheses  are  removed  by 
the  Action  Scheduler  after  the  Solution  Generator  proposes  solutions  to  them. 
However,  instead  of  always  removing  hypotheses  when  no  acceptable  solution 
is  found,  we  may  want  to  update  the  attributes  of  the  original  hypotheses 
when  more  information  is  available.  The  function  “update”  is  used  to  describe 
the  updating  of  the  attributes  of  a  hypothesis. 

For  example,  consider  the  following  rule: 

<control-condition> : ...
<hypothesis> : H = F(self)
<action> :
if H = nil then update(H, CS1)
else ...

The action specifies that if the solution proposed for H is nil, then the AS
replaces some attributes of hypothesis H by CS1. However, H is not removed
from the iconic/symbolic database. The <action> part is inserted again into
the action list (its cause of delay is H).

There  is  yet  another  category  of  actions  which  specifies  the  constraints 
on  the  evaluation  of  multiple  rules.  We  describe  this  type  by  an  example. 

Any instance of a HOUSE-GROUP frame can be “along” at most one
ROAD instance. Given a HOUSE-GROUP instance, say IHG, we may not yet
know the location of the road along IHG, i.e., at location Fl or at location
Fr (see Figure 4-10). One strategy is to create hypotheses about a ROAD at
both locations. However, once one hypothesis is verified, the other hypothesis
must be removed.

The  above  knowledge  is  represented  as  follows: 


Rule R1:
<control-condition> : true,
<hypothesis> : H1 = Fl(self),
<action> : self.along-road = H1.

Rule R2:
<control-condition> : true,
<hypothesis> : H2 = Fr(self),
<action> : self.along-road = H2.


In  addition,  the  following  rule  for  the  HOUSE-GROUP  frame  constrains  the 
simultaneous  evaluation  of  RVR2: 


Rule  Rcontroi. 

<  control-condition  >  : 

not-null(anyone(Fl,i?2))J 

<  hypothesis >  :  nil, 

<  action  >  : 

remove-all(anyone(iZ1,/22)). 

where  anyone(i?1,i?2)= 
if  is-evaluated^J  then  R2 
else  if  is-evaluated(i?2)  then  Rl 
else  nil 


The above rule specifies that whenever one of the <action> parts of the
rules R1 or R2 is evaluated, rule Rcontrol is evaluated, which causes the
removal of all the hypotheses that were created by the <hypothesis> part of
the other rule (R2.<hypothesis> or R1.<hypothesis>, respectively).

Suppose a HOUSE-GROUP instance is created. The instance activates
rules R1 and R2 and generates two hypotheses about the ROAD object.
Whenever the SG proposes a ROAD instance to one of the hypotheses, the AS
evaluates one of the delayed actions, and causes the removal of the other
hypothesis.
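
The mutual exclusion expressed by the control rule can be sketched as follows. This is a minimal Python illustration, not SIGMA's code: the set `evaluated`, the mapping `hypotheses_by_rule`, and the list-based database are assumptions introduced for the example.

```python
def anyone(evaluated, r1="R1", r2="R2"):
    """Return the other rule once the <action> of one rule has been evaluated."""
    if r1 in evaluated:
        return r2
    if r2 in evaluated:
        return r1
    return None                                # corresponds to nil

def control_rule(evaluated, hypotheses_by_rule, database):
    """Rcontrol: if anyone(R1, R2) is not nil, remove every hypothesis
    created by the rule it returns."""
    other = anyone(evaluated)
    if other is not None:
        for h in hypotheses_by_rule.get(other, []):
            if h in database:
                database.remove(h)
    return other

database = ["H1", "H2"]
hypotheses_by_rule = {"R1": ["H1"], "R2": ["H2"]}
# A ROAD instance is proposed for H1, so the <action> of R1 is evaluated.
removed_from = control_rule({"R1"}, hypotheses_by_rule, database)
```

Only the competing hypothesis H2 is removed here; the normal removal of the verified hypothesis H1 is left out of the sketch.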

We summarize the actions discussed in this section in Table 4-3.


Table 4-3. A taxonomy of actions

Action Type    Example
-----------    ----------------------------------------
Attributes     Filling in of attributes in an instance.
Instances      Create instances.
               Unify instances.
Hypotheses     Remove hypotheses.
               Update hypotheses.
Rules          Constrain the simultaneous evaluation
               of several rules.


4.6. Pursuing alternative hypotheses

It is possible that several hypotheses may be generated at the same time.
This can be represented as the following rule:

if <control-condition> then

<hypothesis 1> <action 1>
or

<hypothesis 2> <action 2>
or

<hypothesis n> <action n>

Whenever <control-condition> evaluates to true, all of the <hypothesis>s
can be generated. These hypotheses are called alternative hypotheses, and we
assume that at most one of them is in fact true. However, it is
difficult to decide which one should be pursued first, since a promising
selection may turn out to be incorrect as new facts (generated by
resegmentation) are obtained.

In SIGMA, all the alternative hypotheses are generated and participate in
the hypothesis integration process. However, the associated actions of these
alternative hypotheses are not evaluated immediately; they are put in the
delayed-action queue. When any one of the alternative hypotheses is verified,
it is left to the associated action to decide whether the other alternative
hypotheses should be pruned. On the one hand, this strategy allows multiple
alternative hypotheses to be pursued simultaneously. On the other hand,
expert domain knowledge, which can be described in a rule, can be used to
prune unpromising hypotheses when enough facts are known.
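
The strategy can be sketched as a small delayed-action queue. The Python below is only an illustration under stated assumptions: the function names, the set-based database, and the `fill_road` action (which prunes a given set of rival hypotheses) are hypothetical and do not reproduce SIGMA's scheduler.

```python
def generate_alternatives(alternatives, database, delayed):
    """Insert every alternative hypothesis and queue its action unevaluated."""
    for hypothesis, action in alternatives:
        database.add(hypothesis)
        delayed.append((hypothesis, action))

def on_verified(hypothesis, solution, frame, database, delayed):
    """Evaluate the delayed actions whose cause of delay is `hypothesis`."""
    for cause, action in list(delayed):
        if cause == hypothesis:
            action(solution, frame, database)
    database.discard(hypothesis)
    # Keep only the delayed actions whose cause is still a live hypothesis.
    delayed[:] = [(c, a) for c, a in delayed if c in database]

def fill_road(rivals):
    """Build an action that fills a slot and prunes the rival hypotheses."""
    def action(solution, frame, database):
        frame["along-road"] = solution
        database.difference_update(rivals)
    return action

database, delayed, house_group = set(), [], {}
generate_alternatives(
    [("H1", fill_road({"H2"})), ("H2", fill_road({"H1"}))], database, delayed)
pending_before = len(delayed)
on_verified("H1", "ROAD-7", house_group, database, delayed)
```

Whether rivals are pruned is decided inside the action, so a different action could equally leave them in place.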


4.7.  The  selection  of  good  interpretations 

Potentially, SIGMA could construct all possible interpretations for the
image. It is natural to require that no region be interpreted as two different
objects in the scene model. However, in SIGMA, a region may be interpreted
as several objects (e.g., an elongated region might be interpreted either as a
road or as a driveway). Intersecting image structures may therefore be used to
construct DE’s whose iconic descriptions should never intersect. A pair of
DE’s whose iconic descriptions intersect although the scene model specifies
otherwise is called a pair of conflicting DE’s. The associated interpretations
are called alternative interpretations.

For a set of conflicting DE’s, we need to select a DE which “best” interprets
the image. It is possible to design an algorithm to select such “best”
interpretations. However, we did not investigate this issue in SIGMA.
Instead, we model the final selection process as a database query answering
process. A program (QAM) was developed to answer simple queries about
DE’s in the interpretation network and to display the iconic descriptions of
the DE’s selected.
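
The notions of conflicting DE’s and of a simple query can be sketched as follows. This Python fragment is an illustration only: representing an iconic description as a set of pixel coordinates, the `may_overlap` parameter, and the `query` function are assumptions and do not describe the QAM program.

```python
from itertools import combinations

def conflicting_pairs(des, may_overlap=frozenset()):
    """Pairs of DE names whose iconic descriptions (pixel sets) intersect
    although the scene model does not allow their types to overlap."""
    pairs = []
    for (a, da), (b, db) in combinations(sorted(des.items()), 2):
        allowed = frozenset((da["type"], db["type"])) in may_overlap
        if not allowed and da["pixels"] & db["pixels"]:
            pairs.append((a, b))
    return pairs

def query(des, object_type, predicate=lambda de: True):
    """A simple query: names of the DE's of one type that satisfy a predicate."""
    return sorted(name for name, de in des.items()
                  if de["type"] == object_type and predicate(de))

des = {
    "DE1": {"type": "ROAD", "pixels": {(0, 0), (0, 1), (0, 2)}, "length": 120},
    "DE2": {"type": "DRIVEWAY", "pixels": {(0, 2), (1, 2)}, "length": 15},
    "DE3": {"type": "HOUSE", "pixels": {(5, 5), (5, 6)}, "length": 2},
}
conflicts = conflicting_pairs(des)
long_roads = query(des, "ROAD", lambda de: de["length"] > 100)
```

Here DE1 and DE2 share a pixel and are reported as conflicting, and the query returns the one ROAD whose length exceeds 100.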


5.  Examples 


5.1.  Introduction 

This  section  presents  detailed  examples  of  the  application  of  SIGMA  to 
the  analysis  of  a  high  resolution  aerial  image  to  locate  houses,  roads,  and 
driveways  in  a  suburban  scene. 

We  first  present  an  example  of  the  initial  segmentation  process.  Then  we 
discuss  how  the  HLVS  analyzes  the  image  in  several  typical  situations. 
Finally,  we  show  the  results  of  analysis  by  SIGMA  on  an  aerial  image. 

5.2.  Initial  segmentation 

The  image  used  in  the  example  is  a  250  *  140  window  of  an  aerial  image 
(Figure  5-1)  with  intensities  in  the  range  of  0  to  63.  The  scene  contains 
houses,  roads,  and  driveways. 

5.2.1.  Initial  segmentation  goals 

We want to locate houses and roads in the image. Since they appear
as either compact rectangles or elongated rectangles, and they are usually
brighter than the background, the following hypotheses are used as the
I-set of the initial segmentation process:


/* extract compact and bright rectangles */
hypothesis Hblob:

target object = rectangle,
in-window = whole image,
rectangle.elongatedness < 10,
rectangle.compactness < 18,
rectangle.region-contrast > 3,
180 < rectangle.area-of < 360.


/* extract bright and elongated rectangles */
hypothesis Hribbon:

target object = rectangle,
in-window = whole image,
8 < rectangle.width < 20,
rectangle.elongatedness > 10,
rectangle.length > 10,
rectangle.compactness > 18,
rectangle.region-contrast > 3.
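
One way to read these two hypotheses is as sets of predicates over region features. The Python sketch below copies the numeric bounds from the hypotheses above; the dictionary layout, the function `satisfies`, and the two sample feature vectors are hypothetical.

```python
H_BLOB = {
    "elongatedness": lambda v: v < 10,
    "compactness": lambda v: v < 18,
    "region-contrast": lambda v: v > 3,
    "area-of": lambda v: 180 < v < 360,
}

H_RIBBON = {
    "width": lambda v: 8 < v < 20,
    "elongatedness": lambda v: v > 10,
    "length": lambda v: v > 10,
    "compactness": lambda v: v > 18,
    "region-contrast": lambda v: v > 3,
}

def satisfies(rectangle, hypothesis):
    """True if every constrained feature is present and within its bounds."""
    return all(name in rectangle and test(rectangle[name])
               for name, test in hypothesis.items())

# Hypothetical feature vectors for two extracted regions.
compact = {"elongatedness": 2, "compactness": 14, "region-contrast": 6,
           "area-of": 300, "width": 15, "length": 20}
ribbon = {"elongatedness": 12, "compactness": 40, "region-contrast": 5,
          "area-of": 1200, "width": 10, "length": 120}
```

With these values, `compact` satisfies only `H_BLOB` and `ribbon` satisfies only `H_RIBBON`.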


5.2.2. Verifying hypothesis Hblob

The Initial Segmentation Controller (ISC) first selects hypothesis Hblob.
The ISC activates the LLVS to compute image primitives that satisfy
hypothesis Hblob. The LLVS selects the following segmentation operators,
listed in descending order of priority:

Blob  finder 

Upper  threshold  method 


The Ribbon finder and the Lower threshold method are not selected since
their selection criteria evaluate to false.

The LLVS activates the Blob finder first. The Blob finder first convolves
the original image with a Laplacian operator. Then it computes the positive
connected regions in the convolved image (Figure 5-2). The regions computed
by the Blob finder which satisfy the constraints of Hblob are shown in Figure
5-3.
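
The two steps of the Blob finder can be sketched in plain Python. This is an illustration under assumptions that the text does not state: a 3 x 3 four-neighbour Laplacian kernel whose sign makes bright blobs positive, border pixels left at zero, and 4-connectivity for the regions. The function names and the toy image are hypothetical.

```python
def laplacian(image):
    """Convolve with a 3x3 Laplacian; border pixels are left at 0."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = (4 * image[y][x] - image[y - 1][x] - image[y + 1][x]
                         - image[y][x - 1] - image[y][x + 1])
    return out

def positive_regions(response):
    """4-connected components of the pixels with a positive response."""
    h, w = len(response), len(response[0])
    seen, regions = set(), []
    for y in range(h):
        for x in range(w):
            if response[y][x] <= 0 or (y, x) in seen:
                continue
            stack, region = [(y, x)], []
            seen.add((y, x))
            while stack:
                cy, cx = stack.pop()
                region.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and response[ny][nx] > 0 and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            regions.append(sorted(region))
    return regions

# Toy image: a bright 2x2 block (50) on a darker background (10).
image = [[10] * 6 for _ in range(6)]
for y in (2, 3):
    for x in (2, 3):
        image[y][x] = 50
regions = positive_regions(laplacian(image))
```

The regions found this way would then be filtered against the constraints of Hblob.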

Since  the  set  of  results  computed  by  the  Blob  finder  is  not  empty,  the 
LLVS  returns  the  computed  regions  to  the  HLVS.  The  Upper  threshold 
method  is  not  evaluated. 
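
The selection and fall-back behaviour described here can be sketched as follows. The Python is illustrative only: the operator table, the priority numbers, and the `selected_for` criteria are assumptions, and the `run` functions are stubs standing in for real segmentation operators.

```python
def run_llvs(kind, image, operators):
    """Try the selected operators in descending priority and return the first
    non-empty result, together with the names of the operators tried."""
    selected = sorted((op for op in operators if kind in op["selected_for"]),
                      key=lambda op: op["priority"], reverse=True)
    tried = []
    for op in selected:
        tried.append(op["name"])
        regions = op["run"](image)
        if regions:                   # non-empty: later operators are skipped
            return regions, tried
    return None, tried                # corresponds to returning nil

OPERATORS = [
    {"name": "Blob finder", "priority": 4, "selected_for": {"blob"},
     "run": lambda image: ["region-1", "region-2"]},
    {"name": "Ribbon finder", "priority": 3, "selected_for": {"ribbon"},
     "run": lambda image: ["ribbon-1"]},
    {"name": "Upper threshold method", "priority": 2,
     "selected_for": {"blob", "ribbon"}, "run": lambda image: ["region-3"]},
    {"name": "Lower threshold method", "priority": 1, "selected_for": set(),
     "run": lambda image: []},
]

regions, tried = run_llvs("blob", None, OPERATORS)
```

Because the Blob finder stub returns a non-empty result, the Upper threshold method is never evaluated, as in the text.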

5.2.3. Verifying hypothesis Hribbon

The ISC then selects hypothesis Hribbon. The ISC activates the LLVS to
compute regions which satisfy hypothesis Hribbon. The segmentation operators
selected by the LLVS for this task, listed in descending order of priority,
are as follows:

Ribbon  finder 
Upper  threshold  method 

The  Blob  finder  and  the  Lower  threshold  method  are  not  selected  since  their 
selection  criteria  evaluate  to  false. 

The LLVS activates the Ribbon finder first. The Ribbon finder first computes
the skeletons of the positive regions shown in Figure 5-2. The resulting
skeletons are shown in Figure 5-4.

Finally, the Ribbon finder decomposes these skeletons and computes the
skeletons for the ribbons. Figure 5-5 shows the skeletons of the ribbons
computed by the Ribbon finder which satisfy the constraints of hypothesis
Hribbon.

Since  the  set  of  results  computed  by  the  Ribbon  finder  is  not  empty,  the 
LLVS  returns  the  computed  regions  to  the  HLVS.  The  Upper  threshold 
method  is  not  evaluated. 

5.2.4.  Generating  instances 

The ISC collects the results computed by the LLVS, creates RECTANGLE
instances, and inserts them into the iconic/symbolic database.

There  are  26  RECTANGLE  instances  created  at  this  stage.  Figure  5-6 
shows  the  iconic  descriptions  of  these  instances.  Note  that  some  of  the 
instances  intersect. 

5.3.  Constructing  partial  interpretations 

A  situation  is  classified  into  one  of  the  following  classes  based  on  how  the 
Solution  Generator  computes  its  proposed  solution: 

Case  1:  The  SG  discovers  an  existing  instance  in  the  iconic/symbolic  database 
which  satisfies  the  given  composite  hypothesis. 

Case 2: The SG cannot find any instance in the iconic/symbolic database
which satisfies the given composite hypothesis. The composite hypothesis is
non-primitive.

Case 3: The SG cannot find any instance in the iconic/symbolic database
which satisfies the given composite hypothesis. The composite hypothesis is
primitive.

Case 4: The SG obtains the solution from the previous iteration (i.e., the
solution for an unconcluded situation is now computed).
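
The four cases can be sketched as a small classification function. This Python fragment is only an illustration: the dictionary fields, the `accepts` predicate, and the order in which the cases are tested (Case 4 first) are assumptions that the text does not specify.

```python
def classify_situation(ch, instances, solved_earlier):
    """Return the case number (1-4) for a composite hypothesis `ch`."""
    if ch["name"] in solved_earlier:
        return 4        # a solution was computed at a previous iteration
    if any(i["type"] == ch["target"] and ch["accepts"](i) for i in instances):
        return 1        # an existing instance satisfies the hypothesis
    return 3 if ch["primitive"] else 2

instances = [{"type": "ROAD", "length": 120}]
anything = lambda instance: True
hypotheses = [
    {"name": "CHa", "target": "ROAD", "primitive": False, "accepts": anything},
    {"name": "CHb", "target": "DRIVEWAY", "primitive": False, "accepts": anything},
    {"name": "CHc", "target": "RECTANGLE", "primitive": True, "accepts": anything},
    {"name": "CHd", "target": "HOUSE", "primitive": False, "accepts": anything},
]
cases = [classify_situation(ch, instances, {"CHd"}) for ch in hypotheses]
```

In Case 2 the hypothesis would be decomposed, and in Case 3 the LLVS would be activated, as described in the following subsections.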

5.3.1. Case 1 -- Discovering an existing instance

Consider the situation shown in Figure 5-7. The relations between the
DE’s shown in this figure are described in Table 5-1.

Figure 5-8 shows the portion of the interpretation network which is
related to this situation.

Assume the focus of attention mechanism selects situation S1, whose
P-set is as follows:

{ DE1, DE3, DEr }.

Suppose the composite hypothesis, say CHa, computed for S1 is:
target object = ROAD,

Since the P-set of the situation contains an instance, DEr, the SG proposes
it as the solution to the composite hypothesis constructed for this situation.

The AS activates those actions whose causes of delay are DE1 and DE3,
respectively. Figure 5-9 shows the resulting interpretation network. Note that
hypotheses DE2 and DE4 are removed. This is caused by a control rule in the
HOUSE-GROUP frame which specifies that each HOUSE-GROUP instance
can be along at most one road.

5.3.2. Case 2 -- Decomposing a hypothesis

Consider the situation shown in Figure 5-10. The relations between the
DE’s shown in this figure are described in Table 5-2.

Figure 5-11 shows a portion of the interpretation network related to this
situation.

Assume the focus of attention mechanism selects the situation S1 whose
P-set is

{ DE1, DE2 }.

Assume the composite hypothesis, say CHa, computed for S1 is
target object = DRIVEWAY,

The SG cannot find any existing instance that satisfies CHa. Since CHa is
non-primitive, the AS marks it as partially processed and inserts it into the
iconic/symbolic database.

At the subsequent iterations, CHa activates the rule R_first-order-properties
of frame DRIVEWAY to generate hypothesis DE3:

database-entity DE3:

target object : RECTANGLE,

end-database-entity.

Suppose the action which is delayed by DE3 is A_first-order-properties. We will
revisit this example in Section 5.3.4. Note that DE3 can participate in the
formation of situations with other DE’s in the iconic/symbolic database.
Figure 5-12 shows the resulting interpretation network after DE3 and CHa are
inserted into the iconic/symbolic database. Note that CHa is marked as a
partially processed hypothesis. Table 5-3 summarizes the relations between the
DE’s and action A_first-order-properties.

5.3.3. Case 3 -- Directing the segmentation

Suppose the composite hypothesis, say CHa, given to the SG is primitive.
The SG activates the LLVS to compute regions which satisfy the constraints
provided by the SG. The regions computed by the LLVS are used by the SG
to create RECTANGLE instances. The SG then proposes those created
instances which satisfy the constraints of CHa as solutions. If no instance is
computed, the SG proposes nil as the solution. We illustrate the process used
by our system in the following two examples.

Suppose the composite hypothesis, say CHa, given to the SG is:

target object = RECTANGLE,
in-window = W1,
rectangle.elongatedness < 10,
rectangle.compactness < 18,
275 < rectangle.area-of < 325.

The window W1 is shown in Figure 5-13.

The  LLVS  first  activates  the  Blob  finder  and  fails  to  compute  any  region. 
Then  the  LLVS  activates  the  Upper  threshold  method  to  compute  regions.  A 
region  is  successfully  computed  by  setting  the  threshold  value  at  24.  Figure 
5-14  shows  some  of  the  intermediate  results  of  the  segmentation  process.  The 
measurements  (the  area  and  the  compactness  of  a  region)  are  shown  for  the 
largest  region  extracted  at  each  specified  threshold  value. 
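
The threshold search can be sketched as follows. This Python fragment is an illustration under assumptions that the text does not give: thresholds are tried from high to low, a pixel belongs to a region when it is brighter than the threshold, regions are 4-connected, and only the area constraint is checked. The toy image and its bounds are hypothetical.

```python
def largest_region(image, threshold):
    """Largest 4-connected region of pixels brighter than `threshold`."""
    h, w = len(image), len(image[0])
    seen, best = set(), []
    for y in range(h):
        for x in range(w):
            if image[y][x] <= threshold or (y, x) in seen:
                continue
            stack, region = [(y, x)], []
            seen.add((y, x))
            while stack:
                cy, cx = stack.pop()
                region.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and image[ny][nx] > threshold
                            and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            if len(region) > len(best):
                best = region
    return best

def upper_threshold(image, accept_area, thresholds):
    """Return (threshold, region) for the first threshold whose largest
    region has an acceptable area, or None if no threshold succeeds."""
    for t in thresholds:
        region = largest_region(image, t)
        if region and accept_area(len(region)):
            return t, region
    return None

image = [
    [10, 10, 10, 10, 10, 10],
    [10, 25, 25, 25, 25, 10],
    [10, 25, 40, 40, 25, 10],
    [10, 25, 40, 40, 25, 10],
    [10, 25, 25, 25, 25, 10],
    [10, 10, 10, 10, 10, 10],
]
result = upper_threshold(image, lambda area: 10 < area < 20, [45, 35, 24, 15])
```

In this toy case the bright core alone is too small at threshold 35, and the region first becomes acceptable at threshold 24.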

The LLVS returns the computed region to the SG. The SG checks the
features of the region, creates a RECTANGLE instance DErect, and proposes
it as the solution. Figure 5-15 shows the RECTANGLE instance created
by the SG.

Suppose the composite hypothesis CHa is again given to the SG. However,
the window W1 is now as shown in Figure 5-16.

The LLVS activates the Blob finder, the Upper threshold method, and
the Lower threshold method, but cannot compute any region which satisfies
the given constraints. The LLVS returns “nil” to the SG. The SG then
proposes nil as the solution.

5.3.4. Case 4 -- Analyzing an unconcluded situation

Consider the interpretation network shown in Figure 5-12. Suppose that
at some other iteration the SG computes a solution, say I0, for DE3. Action
A_first-order-properties is now evaluated by the AS.

Two possible outcomes can be produced by the evaluation of
A_first-order-properties. First, the evaluation of action
A_first-order-properties generates a solution, say I1, for CHa. This causes
the HLVS to analyze the unconcluded situation S1 in the next iteration. The
SG will propose I1 as the solution to CHa, the composite hypothesis of S1.

Figure 5-17 shows the resulting interpretation network in this case. The
unconcluded situation S1, the partially processed hypothesis CHa, and the
hypothesis DE3 generated by the “decomposition method” are all removed.

Second, suppose no solution is generated by the evaluation of
A_first-order-properties. Instead, the evaluation causes changes to be made to
the attributes of DE3. In this case, situation S1 is removed from the situation
lattice and new situations are constructed. Suppose DE3a is the updated
hypothesis. Figure 5-18 shows the resulting interpretation network in this
case.


5.4.  A  complete  example 

In  this  section,  we  present  the  result  of  applying  our  image  interpretation 
program  to  the  image  shown  in  Figure  5-1.  No  explicit  goal  is  given  to  the 
system.  The  analysis  terminates  when  all  the  hypotheses  created  are  verified 
or  refuted. 

Figure 5-6 shows the RECTANGLE instances generated by the initial
segmentation process. Figure 5-19 shows those RECTANGLE instances
which are interpreted as RECTANGULAR-HOUSE instances (requiring that
200 < rectangle.area-of < 400), and Figure 5-20 shows those RECTANGLE
instances which are interpreted as VISIBLE-ROAD-PIECE instances (requiring
that 6 < rectangle.width < 12). No RECTANGLE instances are interpreted as
DRIVEWAY instances.
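
The two requirements quoted above can be sketched as a small labelling function. The Python is illustrative only: it assumes that the quoted bounds are the only tests applied, which the text does not guarantee, and the function name and feature dictionary are hypothetical.

```python
def initial_interpretations(rectangle):
    """Object types a RECTANGLE instance may initially be interpreted as,
    using only the two requirements quoted in the text."""
    labels = []
    if 200 < rectangle["area-of"] < 400:
        labels.append("RECTANGULAR-HOUSE")
    if 6 < rectangle["width"] < 12:
        labels.append("VISIBLE-ROAD-PIECE")
    return labels

house_like = initial_interpretations({"area-of": 250, "width": 15})
road_like = initial_interpretations({"area-of": 1200, "width": 10})
both = initial_interpretations({"area-of": 300, "width": 10})
```

A rectangle that meets both requirements receives two alternative interpretations, in line with Section 4.7.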

Instead  of  showing  the  processing  of  each  situation  by  the  program,  we 
show  only  the  processing  of  several  interesting  situations. 

In the scene model, two HOUSE-GROUP instances that share a common
HOUSE instance are identical and should be unified into a single
instance. Figure 5-21(a) shows such an example. Let P1 and P2 denote two
HOUSE instances, R1 and R2 two HOUSE-GROUP instances, and DE1 a
HOUSE hypothesis.


Each HOUSE-GROUP instance creates hypotheses about more houses
that belong to it. The process to unify the house groups is as follows:

(1) The situation S1 whose P-set is

{ DE1, P2 }

is selected by the focus of attention mechanism.

(2) The SG proposes HOUSE instance P2 as the solution to the composite
hypothesis of situation S1. The evaluation of the action which is delayed by
DE1 fills P2 in the “contains” slot of HOUSE-GROUP instance R1.

(3) Since P2 “belongs to” two HOUSE-GROUP instances at the subsequent
iteration, the evaluation of a rule in the HOUSE frame unifies R1 and R2.

Let us denote the resulting HOUSE-GROUP instance by R1. Figure 5-22
shows the result of the analysis.
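
The unification step can be sketched as merging groups that share a member. This Python fragment is an illustration only: representing a HOUSE-GROUP instance by the set of HOUSE names in its “contains” slot, and the function `unify_groups`, are assumptions.

```python
def unify_groups(groups):
    """Merge HOUSE-GROUP instances that share a HOUSE instance.
    Returns a list of (group names, houses) pairs."""
    merged = []
    for name, houses in groups.items():
        houses, names, rest = set(houses), [name], []
        for m_names, m_houses in merged:
            if houses & m_houses:          # a common HOUSE instance exists
                houses |= m_houses
                names = m_names + names
            else:
                rest.append((m_names, m_houses))
        merged = rest + [(names, houses)]
    return merged

# After step (2): P2 has been filled into the "contains" slot of R1.
groups = {"R1": {"P1", "P2"}, "R2": {"P2"}}
unified = unify_groups(groups)
```

Here R1 and R2 share P2, so they collapse into a single group containing P1 and P2.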

Figure 5-23 shows another example. Resegmentation of the image is
required in this example. Let Ri denote a HOUSE-GROUP instance, Pi a
HOUSE instance, and DEi a HOUSE hypothesis. Also let CHi denote a partially
processed hypothesis, and T1 a RECTANGLE instance. These DE’s are not
shown in Figure 5-23. They are used later in this example.

The steps that lead to the activation of the LLVS are as follows:

(1) Situation S1 whose P-set is

{ DE1, DE3 }

is selected. Since the composite hypothesis (whose target object is HOUSE) is
non-primitive, a partially processed hypothesis, say CH1, is generated.


(2) At the next iteration, the evaluation of the rule
R_specialization-strategy of the HOUSE frame generates a hypothesis DE5
whose target object is RECTANGULAR-HOUSE (Figure 5-24(a)).

(3) Situation S2 whose P-set contains DE5 is selected. Again, a partially
processed hypothesis, say CH2, about RECTANGULAR-HOUSE is generated.

(4) At the following iteration, the evaluation of the rule
R_first-order-properties of the RECTANGULAR-HOUSE frame generates a
hypothesis DE6 whose target object is RECTANGLE (Figure 5-24(b)).

(5) The SG activates the LLVS to segment the image. A region is computed
by the LLVS (see Figures 5-13, 5-14, and 5-15). The SG creates a RECTANGLE
instance T1.

(6) The evaluation of the <action> of R_first-order-properties creates a
RECTANGULAR-HOUSE instance P4. Since a solution is now ready for the
unconcluded situation S2, the HLVS schedules it to be processed next.
Afterwards, since a solution is now ready for the unconcluded situation S1,
the HLVS schedules it to be processed next. Now, the actions delayed by DE1
and DE3 can be evaluated. The resulting interpretation network is shown in
Figure 5-24(c).

(7) P4 “belongs to” two HOUSE-GROUP instances. At the subsequent
iteration, the evaluation of a rule in the HOUSE frame unifies R1 and R2.

Figure 5-25 shows the resulting HOUSE-GROUP instance.

Figure  5-25  shows  the  resulting  HOUSE-GROUP  instance. 

In the scene model, every ROAD instance is smoothly extended from one
ROAD-TERMINATOR instance to another ROAD-TERMINATOR instance.
A ROAD-TERMINATOR is defined to be the boundary of the image. We
present an example in the following paragraphs.

The extension of ROAD instances is similar to the merging of two
HOUSE-GROUP instances discussed above. Figure 5-26 shows two ROAD
instances R1 and R2. P1 and P2 are two ROAD-PIECE instances. DE1 denotes
a ROAD-PIECE hypothesis. The extension of ROAD instance R1 activates
the merging of R1 and R2 into one ROAD instance (Figure 5-27).

Figure 5-28 shows another case. R1 and R2 are two ROAD instances. DE1
is a ROAD-PIECE hypothesis generated by R1. Since R2 is not “connected”
to R1, hypothesis DE1 is modified as shown in Figure 5-29.

Figure 5-30 shows yet another case. ROAD instance R1 cannot be extended
any further. When this is detected, the original ROAD-PIECE hypothesis is
removed and a ROAD-TERMINATOR hypothesis is generated.

Figure 5-31 shows another example. Let DEr denote a ROAD instance,
DEh a HOUSE instance, DEre a RECTANGLE instance, and DE1 a DRIVEWAY
hypothesis. HOUSE instance DEh and ROAD instance DEr create
hypotheses DE1 and DE2 about the DRIVEWAY object, respectively. There is
no DRIVEWAY instance in the iconic/symbolic database which satisfies these
hypotheses. However, there is a RECTANGLE instance, DEre, which, if
interpreted as a DRIVEWAY object, would satisfy these hypotheses. Note that
DEre is not interpreted as a DRIVEWAY object, a VISIBLE-ROAD-PIECE,
or a RECTANGULAR-HOUSE since there are not enough distinguishing
features of DEre to make these interpretations.


The HLVS performs the analysis as follows:

(1) A composite hypothesis, CH1, is first constructed for the situation whose
P-set is

{ DE1, DE2 }.

(2) A hypothesis, DE3, about the RECTANGLE object is created by the
composite hypothesis CH1.

(3) DEre satisfies DE3. A DRIVEWAY instance DEdr is created by the
<action> part of the rule R_first-order-properties of the DRIVEWAY frame. The
DRIVEWAY instance DEdr satisfies both DE1 and DE2. Figure 5-32 shows the
resulting interpretation network after DE1 and DE2 are removed.

The  resulting  interpretation  network  is  shown  in  Figure  5-33.  The  iconic 
descriptions  of  the  instances  created  during  the  analysis  are  shown  in  Figures 
5-34  and  5-35. 

Finally, we present two examples of the final selection stage of the
program. Figure 5-36(a) shows a ROAD instance whose length is longer than 100.
Instances of related objects are shown in Figure 5-36(b), (c), and (d).

Figure 5-37(a) shows a HOUSE-GROUP instance with more than four
houses. Instances of related objects are shown in Figure 5-37(b) and (c).


6.  Conclusions 


This paper has described a model for the development of image
understanding systems that involves representing scene domain knowledge using
frames and controlling the actions of the system by hypothesis integration.
Using such a framework, we developed a flexible image understanding system
called SIGMA which performs both top-down (goal-oriented) image analysis
and bottom-up construction of composite image structures, and demonstrated
the system’s performance on an aerial image of a suburban scene.

Developing computer systems for visual applications is one way to
investigate how humans see, and also to make computers more useful. As pointed
out by many researchers [Hall79], [Binf82], image analysis systems usually
consist of several types of modules: low level vision modules (e.g.,
segmentation) and high level vision modules (e.g., matching, inference). This
research leads to the conclusion that a powerful vision system should rely on
a balance of performance between these two types of modules. The low level
modules should provide descriptive information about the image to the high
level modules and the high level modules should provide “hints” about image
structures to the low level modules. This research is only a small step toward
the construction of general vision systems.


Figure 1-1. Mappings between the scene and the image. (Panels: Scene
Model; An Interpretation. Legend: object, instance of object, mappings.)


frame RECTANGULAR-HOUSE;
    rules:
        R_rectangle;
    links:
        AKO : HOUSE;
end-frame

frame L-SHAPED-HOUSE;
    rules:
        R_l-shape;
    links:
        AKO : HOUSE;
end-frame

frame HOUSE;
    slots:
        centroid;
        shape-description;
        front-of-house;
        connecting-driveway;
    rules:
    links:
        CAN-BE RECTANGULAR-HOUSE, L-SHAPED-HOUSE;
end-frame

Figure 2-1. Frame definitions for HOUSE, RECTANGULAR-HOUSE
and L-SHAPED-HOUSE.


Figure 2-2. Links between HOUSE, RECTANGULAR-HOUSE
and L-SHAPED-HOUSE frames. (Legend: AKO links, CAN-BE links.)


Figure 2-3. A model of a suburban housing development. (Legend: AKO
links, CAN-BE links, rules.)


Figure 2-4. Pictorial description of house-road relati


Figure 3-1. Using a relation as a constraint. (Diagram label: REL(O1, O2)?)


Figure  3-2(a).  The  situation  lattice  before  the  insertion. 


iconic  descriptions 


situation  lattice 


Figure  3-2(b).  The  situation  lattice  after  the  insertion. 


Figure  4-1.  The  stages  of  the  control  of  SIGMA. 


Figure  4-2.  The  schematic  diagram  of  the  initial  segmentation 
process. 


Figure 4-4(a). Reasoning steps for constructing HG0. (Steps shown: a house
group instance containing H0 is created; generate hypothesis about possible
house in HG0; fill H1 in instance HG0.)


Figure 4-4(b). Reasoning steps for constructing HG1. (Steps shown: a house
group instance containing H1 is created; generate hypothesis about possible
house in HG1; fill H0 in instance HG1.)


Figure 4-7. Decomposition of CH1. (Contents: target object of CH1:
RECTANGULAR-HOUSE; target object of H: RECTANGLE; delayed action:
if H = nil then conclude(nil)
else conclude(make-instance(RECTANGLE-HOUSE, H)).)


Figure 4-8. The resulting situation lattice. (Legend: unconcluded
situation, hypothesis, partially processed hypothesis.)


Figure 4-9. The situation lattice after actions are evaluated. (Legend:
unconcluded situation, partially processed hypothesis.)


Figure 4-10. Possible road locations along HG.


Figure  5-1.  An  aerial  image. 


Figure 5-2. Positive connected regions.


Figure 5-3. Blobs extracted by Blob-finder.


Figure  5-4.  Skeletons  of  the 
connected  components. 


Figure  5-5.  Skeletons  of  the  ribbons 
extracted  by  Ribbon-finder. 


Figure  5-6.  Iconic  descriptions  of  the  RECTANGLE  instances  generated 
based  on  the  initial  segmentation  process. 


Figure 5-10(a). An example (see Section 5.3.2).


Figure 5-10(b). ... situation.


DE’s    Type                   Generated-by
----    -------------------    ------------
DEr     ROAD instance
DEh     HOUSE instance
DE1     DRIVEWAY hypothesis    DEh
DE2     DRIVEWAY hypothesis    DEr

Table 5-2. The descriptions of the DE’s.


Figure 5-11. Portion of the interpretation network related to the
situation.


Figure 5-12. Resulting interpretation network.


Table 5-3. Relations between the DE’s and action A_first-order-properties.


Figure 5-13. A window generated by the HLVS.


Figure 5-14. Intermediate results of the LLVS.


Figure 5-15. The RECTANGLE instance generated by the HLVS (based on
the results computed by the LLVS).


Figure 5-17. Resulting interpretation network (when a solution has been
generated).


Figure 5-18. Resulting interpretation network (when no solution has been
computed).


Figure 5-19. Initial set of RECTANGULAR-HOUSE instances.


Figure 5-20. Initial set of VISIBLE-ROAD-PIECE instances.


Figure 5-21(a). Two HOUSE-GROUP instances (see Section 5.4).


Figure 5-21(b). Portion of the interpretation network related to the
situation.


Figure 5-22(a). Resulting HOUSE-GROUP instance R1.


Figure 5-22(b). Hypotheses generated by R1.


Figure 5-23(a). Two HOUSE-GROUP instances (see Section 5.4).


Figure 5-23(b). Portion of the interpretation network related to the
situation.


Figure 5-25(a). Resulting HOUSE-GROUP instance.


Figure 5-25(b). Resulting interpretation network.


Figure 5-26(a). Two ROAD instances (see Section 5.4).


Figure 5-26(b). Portion of the interpretation network related to the
situation.


Figure 5-27(a). Resulting ROAD instance.


Figure 5-27(b). Resulting interpretation network.


Figure 5-28(a). Two ROAD instances (see Section 5.4).


Figure 5-28(b). A depiction of the situation.


Figure 5-29. Hypothesis DE1 has been modified.


Figure 5-30. A ROAD-TERMINATOR hypothesis has been generated.



Figure  5-33.  Final  interpretation  network. 


Figure  5-36.  Explanation  of  a  ROAD  instance. 


Figure  5-37.  Explanation  of  a  HOUSE  GROUP  instance 


